PROTOCADHERIN MATERIALS AND METHODS

This Application is a continuation-in-part of International Patent
No. PCT/US93/12588 filed December 23, 1993 which is in turn a
continuation-in-part of U.S. Patent Application Serial No. 07/998,003 which was
filed on December 29, 1992.

FIELD OF THE INVENTION 
The present invention relates, in general, to materials and methods 
relevant to cell-cell adhesion. More particularly, the invention relates to novel 
adhesion proteins, designated protocadherins, and to polynucleotide sequences 
encoding the protocadherins. The invention also relates to methods for inhibiting 
binding of the protocadherins to their natural ligands/antiligands. 

BACKGROUND 

In vivo, intercellular adhesion plays an important role in a wide 
range of events including morphogenesis and organ formation, leukocyte 
extravasation, tumor metastasis and invasion, and the formation of cell junctions.
Additionally, cell-cell adhesion is crucial for the maintenance of tissue integrity. 

Intercellular adhesion is mediated by specific cell surface adhesion 
molecules. Cell adhesion molecules have been classified into at least four families 
including the immunoglobulin superfamily, the integrin superfamily, the selectin 
family and the cadherin superfamily. All cell types that form solid tissues express 
some members of the cadherin superfamily suggesting that cadherins are involved 
in selective adhesion of most cell types. 

Cadherins have been generally described as glycosylated integral 
membrane proteins that have an N-terminal extracellular domain (the N-terminal 
113 amino acids of the domain appear to be directly involved in binding) 
consisting of five subdomains characterized by sequences unique to cadherins, a 
hydrophobic membrane-spanning domain and a C-terminal cytoplasmic domain 
that interacts with the cytoskeleton through catenins and other
cytoskeleton-associated proteins. Some cadherins lack a cytoplasmic domain, however, and
appear to function in cell-cell adhesion by a different mechanism than cadherins
having a cytoplasmic domain. The cytoplasmic domain is required for the
adhesive function of the extracellular domain in cadherins that do have a
cytoplasmic domain. Binding between members of the cadherin family expressed
on different cells is homophilic (i.e., a member of the cadherin family binds to
cadherins of its own or a closely related subclass) and Ca2+-dependent. For
recent reviews on cadherins, see Takeichi, Annu. Rev. Biochem., 59: 237-252
(1990) and Takeichi, Science, 251: 1451-1455 (1991).

The first cadherins to be described (E-cadherin in mouse epithelial
cells, L-CAM in avian liver, uvomorulin in the mouse blastocyst, and CAM
120/80 in human epithelial cells) were identified by their involvement in
Ca2+-dependent cell adhesion and their unique immunological characteristics and tissue
localization. With the later immunological identification of N-cadherin, which
was found to have a different tissue distribution than E-cadherin, it became
apparent that a new family of Ca2+-dependent cell-cell adhesion molecules had
been discovered.

The molecular cloning of the genes encoding E-cadherin [see
Nagafuchi et al., Nature, 329: 341-343 (1987)], N-cadherin [Hatta et al., J. Cell
Biol., 106: 873-881 (1988)], and P-cadherin [Nose et al., EMBO J., 6: 3655-3661
(1987)] provided structural evidence that the cadherins comprised a family of cell
adhesion molecules. Cloning of L-CAM [Gallin et al., Proc. Natl. Acad. Sci.
USA, 84: 2808-2812 (1987)] and uvomorulin [Ringwald et al., EMBO J., 6:
3647-3653 (1986)] revealed that they were identical to E-cadherin. Comparisons
of the amino acid sequences of E-, N-, and P-cadherins showed a level of amino
acid similarity of about 45%-58% among the three subclasses. Liaw et al.,
EMBO J., 9: 2701-2708 (1990) describes the use of PCR with degenerate
oligonucleotides based on conserved regions of the E-, N- and P-cadherins to
amplify N- and P-cadherin from a bovine microvascular endothelial cell cDNA.

The isolation by PCR of eight additional cadherins was reported in
Suzuki et al., Cell Regulation, 2: 261-270 (1991). Subsequently, several other
cadherins were described including R-cadherin [Inuzuka et al., Neuron, 7: 69-79
(1991)], M-cadherin [Donalies, Proc. Natl. Acad. Sci. USA, 88: 8024-8028
(1991)], B-cadherin [Napolitano, J. Cell. Biol., 113: 893-905 (1991)] and
T-cadherin [Ranscht, Neuron, 7: 391-402 (1991)].

Additionally, proteins distantly related to cadherins such as
desmoglein [Goodwin et al., Biochem. Biophys. Res. Commun., 173: 1224-1230
(1990) and Koch et al., Eur. J. Cell Biol., 53: 1-12 (1990)] and the desmocollins
[Holton et al., J. Cell Science, 97: 239-246 (1990)] have been described. The
extracellular domains of these molecules are structurally related to the
extracellular domains of typical cadherins, but each has a unique cytoplasmic
domain. Mahoney et al., Cell, 67: 853-868 (1991) describes a tumor suppressor
gene of Drosophila, called fat, that also encodes a cadherin-related protein. The
fat tumor suppressor comprises 34 cadherin-like subdomains followed by four
EGF-like repeats, a transmembrane domain, and a novel cytoplasmic domain.
The identification of these cadherin-related proteins is evidence that a large
superfamily characterized by a cadherin extracellular domain motif exists.

Studies of the tissue expression of the various cadherin-related
proteins reveal that each subclass of molecule has a unique tissue distribution
pattern. For example, E-cadherin is found in epithelial cells while N-cadherin is
found in neural and muscle cells. Expression of cadherin-related proteins also
appears to be spatially and temporally regulated during development because
individual proteins appear to be expressed by specific cells and tissues at specific
developmental stages [for review see Takeichi (1991), supra]. Both the ectopic
expression of cadherin-related proteins and the inhibition of native expression of
cadherin-related proteins hinder the formation of normal tissue structure [Detrick
et al., Neuron, 4: 493-506 (1990); Fujimori et al., Development, 110: 97-104
(1990); Kintner, Cell, 69: 225-236 (1992)].



The unique temporal and tissue expression pattern of the different
cadherins and cadherin-related proteins is particularly significant when the role
each subclass of proteins may play in vivo in normal events (e.g., the maintenance
of the intestinal epithelial barrier) and in abnormal events (e.g., tumor metastasis
or inflammation) is considered. Different subclasses or combinations of
subclasses of cadherin-related proteins are likely to be responsible for different
cell-cell adhesion events in which therapeutic detection and/or intervention may
be desirable. For example, auto-antibodies from patients with pemphigus
vulgaris, an autoimmune skin disease characterized by blister formation caused
by loss of cell adhesion, react with a cadherin-related protein, offering direct
support for adhesion function of cadherins in vivo [Amagai et al., Cell, 67: 869-877
(1991)]. Studies have also suggested that cadherins and cadherin-related
proteins may have regulatory functions in addition to adhesive activity.
Matsunaga et al., Nature, 334: 62-64 (1988) reports that N-cadherin has neurite
outgrowth promoting activity. The Drosophila fat tumor suppressor gene appears
to regulate cell growth and suppress tumor invasion as does mammalian E-cadherin
[see Mahoney et al., supra; Frixen et al., J. Cell Biol., 113: 173-185 (1991);
Chen et al., J. Cell Biol., 114: 319-327 (1991); and Vleminckx et al., Cell,
66: 107-119 (1991)]. Thus, therapeutic intervention in the regulatory activities of
cadherin-related proteins expressed in specific tissues may be desirable.

There thus continues to exist a need in the art for the identification 
and characterization of additional cadherin-related proteins which participate in 
cell-cell adhesion and/or regulatory events. Moreover, to the extent that cadherin- 
related proteins might form the basis for the development of therapeutic and 
diagnostic agents, it is essential that the genes encoding the proteins be cloned. 
Information about the DNA sequences and amino acid sequences encoding the 
cadherin-related proteins would provide for the large scale production of the 
proteins by recombinant techniques and for the identification of the tissues/cells 
naturally producing the proteins. Such sequence information would also permit
the preparation of antibody substances or other novel binding molecules
specifically reactive with the cadherin-related proteins that may be useful in 
modulating the natural ligand/antiligand binding reactions in which the proteins 
are involved. 

SUMMARY OF THE INVENTION

The present invention provides cadherin-related materials and 
methods that are relevant to cell-cell adhesion. In one of its aspects, the present 
invention provides purified and isolated polynucleotides (e.g., DNA and RNA, 
both sense and antisense strands) encoding the novel cell adhesion molecules
designated herein as protocadherins, including protocadherin-42, protocadherin- 
43, protocadherin pc3, protocadherin pc4 and protocadherin pc5. Preferred 
polynucleotide sequences of the invention include genomic and cDNA sequences 
as well as wholly or partially synthesized DNA sequences, and biological replicas 
thereof (i.e., copies of the sequences made in vitro). Biologically active vectors 
comprising the polynucleotide sequences are also contemplated.

Specifically illustrating protocadherin polynucleotide sequences of 
the present invention are the inserts in the plasmids pRC/RSV-pc42 and 
pRC/RSV-pc43 which were deposited with the American Type Culture Collection
(ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852 on December 16, 
1992 and were assigned ATCC Accession Nos. 69162 and 69163, respectively. 

The scientific value of the information contributed through the 
disclosures of the DNA and amino acid sequences of the present invention is 
manifest. For example, knowledge of the sequence of a partial or complete DNA 
encoding a protocadherin makes possible the isolation by standard DNA/DNA 
hybridization or PCR techniques of full length cDNA or genomic DNA sequences
that encode the protein (or variants thereof) and, in the case of genomic DNA 
sequences, that specify protocadherin-specific regulatory sequences such as 
promoters, enhancers and the like. Alternatively, DNA sequences of the present 
invention may be chemically synthesized by conventional techniques.
Hybridization and PCR techniques also allow the isolation of DNAs encoding
heterologous species proteins homologous to the protocadherins specifically 
illustrated herein. 

According to another aspect of the invention, host cells, especially 
eucaryotic and procaryotic cells, are stably transformed or transfected with the 
polynucleotide sequences of the invention in a manner allowing the expression of 
protocadherin polypeptides in the cells. Host cells expressing protocadherin 
polypeptide products, when grown in a suitable culture medium, are particularly 
useful for the large scale production of protocadherin polypeptides, fragments and 
variants thereby enabling the isolation of the desired polypeptide products from 
the cells or from the medium in which the cells are grown. 

The novel protocadherin protein products of the invention may be 
obtained as isolates from natural tissue sources, but are preferably produced by 
recombinant procedures involving the host cells of the invention. The products 
may be obtained in fully or partially glycosylated, partially or wholly
de-glycosylated, or non-glycosylated forms depending on the host cell selected for
recombinant production and/or post-isolation processing.

Protocadherin variants according to the invention may comprise 
polypeptide analogs wherein one or more of the specified amino acids is deleted 
or replaced or wherein one or more non-naturally encoded amino acids are added: 
(1) without loss, and preferably with enhancement, of one or more of the 
biological activities or immunological characteristics specific for a protocadherin; 
or (2) with specific disablement of a particular ligand/antiligand binding function.

Also contemplated by the present invention are antibody substances (e.g.,
monoclonal and polyclonal antibodies, chimeric and humanized antibodies,
antibody domains including Fab, Fab', F(ab')2, Fv or single variable domains,
and single chain antibodies) which are specific for the protocadherins of the
invention. Antibody substances can be developed using isolated natural,
recombinant or synthetic protocadherin polypeptide products or host cells
expressing such products on their surfaces. The antibody substances may be
utilized for purifying protocadherin polypeptides of the invention, for determining 
tissue expression of polypeptides and as antagonists of the ligand/antiligand 
binding activities of the protocadherins. Specifically illustrating monoclonal 
antibodies of the present invention are the protocadherin-43 specific monoclonal 
antibodies produced by the hybridoma cell line designated 38I2C which was 
deposited with the ATCC on December 2, 1992 and was assigned ATCC 
Accession No. HB 11207. 

Numerous other aspects and advantages of the present invention 
will be apparent upon consideration of the following detailed description, 
reference being made to the drawing wherein FIGURE 1A-C is an alignment of 
protocadherin amino acid sequences of the invention with the amino acid 
sequences of N-cadherin and of the Drosophila fat tumor suppressor.

DETAILED DESCRIPTION
The present invention is illustrated by the following examples 
wherein Examples 1, 2 and 3 describe the isolation by PCR of protocadherin
polynucleotide sequences. Example 3 also describes the chromosome localization 
of several protocadherin genes of the invention. Example 4 describes the isolation 
by DNA/DNA hybridization of additional protocadherin polynucleotide sequences 
of the present invention. Example 5 presents the construction of expression 
plasmids including polynucleotides encoding protocadherin-42 or protocadherin-43 
and the transfection of L cells with the plasmids. The generation of antibodies 
to protocadherin-42 and protocadherin-43 is described in Example 6. Example 
7 presents the results of immunoassays of transfected L cells for the expression 
of protocadherin-42 or protocadherin-43. Example 8 describes the cell 
aggregation properties of L cells transfected with protocadherin-42, protocadherin- 
43 or a chimeric protocadherin-43/E-cadherin molecule. The calcium-binding 
properties of pc43 are described in Example 9. The results of assays of various 
tissues and cell lines for the expression of protocadherin-42 and protocadherin-43
by Northern blot, Western blot and in situ hybridization are respectively presented
in Examples 10, 11 and 12. Example 13 describes immunoprecipitation 
experiments identifying a 120 kDa protein that coprecipitates with protocadherin- 
43. 

Example 1 

The polymerase chain reaction (PCR) was used to isolate novel rat 
cDNA fragments encoding cadherin-related polypeptides. 
Design of PCR Primers 

Two regions of conserved amino acid sequence, one from the
middle of the third cadherin extracellular subdomain (EC-3) and the other from
the C-terminus of the fourth extracellular subdomain (EC-4), were identified by
comparison of the published amino acid sequences for L-CAM (Gallin et al.,
supra), E-cadherin (Nagafuchi et al., supra), mouse P-cadherin (Nose et al.,
supra), uvomorulin (Ringwald et al., supra), chicken N-cadherin (Hatta et al.,
supra), mouse N-cadherin [Miyatani et al., Science, 245: 631-635 (1989)] and
human P-cadherin [Shimoyama et al., J. Cell. Biol., 109: 1787-1794 (1989)], and
the corresponding degenerate oligonucleotides respectively set out below in
IUPAC-IUB Biochemical nomenclature were designed for use as PCR primers.

Primer 1 (SEQ ID NO: 1) 

5' AARSSNNTNGAYTRYGA 3' 

Primer 2 (SEQ ID NO: 2) 

3' TTRCTRTTRCGNGGNNN 5' 
The degenerate oligonucleotides were synthesized using an Applied Biosystems 
model 380B DNA synthesizer (Foster City, California). 
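
Because Primers 1 and 2 are written in IUPAC-IUB degenerate code, each one stands for a pool of distinct oligonucleotides. The following is a minimal sketch, not part of the original disclosure, of how the size of each pool could be counted from the sequences given above. The function name `degeneracy` and the constant names are illustrative assumptions; the code table is the standard IUPAC nucleotide alphabet.

```python
from math import prod

# Standard IUPAC nucleotide codes mapped to the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences a degenerate primer represents."""
    return prod(len(IUPAC[code]) for code in primer.upper())


# Sequences copied from the text; Primer 2 is kept in the 3'-to-5'
# orientation in which the text prints it (orientation does not affect the count).
PRIMER_1 = "AARSSNNTNGAYTRYGA"  # SEQ ID NO: 1
PRIMER_2 = "TTRCTRTTRCGNGGNNN"  # SEQ ID NO: 2

print(degeneracy(PRIMER_1))
print(degeneracy(PRIMER_2))
```

Counting the pool size this way only multiplies the number of alternatives at each position; it says nothing about how well any individual member of the pool anneals.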
Cloning of cDNA Sequences by PCR

PCR was carried out in a manner similar to that described in
Suzuki et al., Cell Regulation, 2: 261-270 (1991) on a rat brain cDNA
preparation. Total RNA was prepared from rat brain by the guanidinium
isothiocyanate/cesium chloride method described in Maniatis et al., pp. 196 in
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York: Cold
Spring Harbor Laboratory (1982). Brain poly(A)+ RNAs were then isolated using
a FastTrack kit (Invitrogen, San Diego, California) and cDNA was prepared
using a cDNA synthesis kit (Boehringer Mannheim Biochemicals, Indianapolis,
Indiana). The PCR reaction was initiated by adding 2.5 units of Taq DNA
polymerase (Boehringer Mannheim Biochemicals) to 100 ng template cDNA and
10 µg of each primer, after which 35 reaction cycles of denaturation at 94°C for
1.5 minutes, annealing at 45°C for 2 minutes, and polymerization at 72°C for 3
minutes were carried out. Two major bands of about 450 base pairs (bp) and 130
bp in size were found when the products of the PCR reaction were subjected to
agarose gel electrophoresis. The 450 bp band corresponded to the expected length
between the two primer sites corresponding to the middle of the third cadherin
extracellular subdomain (EC-3) and the carboxyl terminus of the fourth cadherin
extracellular subdomain (EC-4), but the 130 bp band could not be predicted from
any of the previously identified cadherin sequences. The 450 bp and 130 bp
bands were extracted by a freezing and thawing method. The resulting fragments
were phosphorylated at the 5' end with T4 polynucleotide kinase and subcloned
by blunt-end ligation into the Sma I site of M13mp18 (Boehringer Mannheim
Biochemicals) for sequence analysis. Sequencing of the
fragments was carried out by the dideoxynucleotide chain termination method
using a Sequenase kit (United States Biochemicals, Cleveland, Ohio). DNA and
amino acid sequences were analyzed using the Beckman Microgenie program
(Fullerton, California).
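
For readers who want to reproduce the thermal program, the cycling conditions stated above can be written down as data. This is only a sketch under stated assumptions: the temperatures are read as 94, 45 and 72 degrees Celsius, the names `Step`, `CYCLE` and `total_minutes` are illustrative, and ramp times between steps are ignored because the text does not give them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float   # temperature in degrees Celsius
    minutes: float  # hold time in minutes


# One reaction cycle as described in the text.
CYCLE = (
    Step("denaturation", 94, 1.5),
    Step("annealing", 45, 2.0),
    Step("polymerization", 72, 3.0),
)
N_CYCLES = 35


def total_minutes(cycle, n_cycles: int) -> float:
    """Total hold time for n_cycles repeats of the cycle (ramp times excluded)."""
    return n_cycles * sum(step.minutes for step in cycle)


print(total_minutes(CYCLE, N_CYCLES))
```

The low annealing temperature is what one would expect for short, highly degenerate primers, although the text itself does not comment on the choice.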
Analysis of cDNA Sequences

Nineteen novel partial cDNA clones were isolated. The DNA and 
deduced amino acid sequences of the clones (including sequences corresponding 
to the PCR primers) are set out as follows: RAT-123 (SEQ ID NOs: 3 and 4 
respectively), RAT-212 (SEQ ID NOs: 5 and 6), RAT-214 (SEQ ID NOs: 7 and
8), RAT-216 (SEQ ID NOs: 9 and 10), RAT-218 (SEQ ID NOs: 11 and 12),
RAT-224 (SEQ ID NOs: 13 and 14), RAT-312 (SEQ ID NOs: 15 and 16), RAT- 
313 (SEQ ID NOs: 17 and 18), RAT-314 (SEQ ID NOs: 19 and 20), RAT-315 
(SEQ ID NOs: 21 and 22), RAT-316 (SEQ ID NOs: 23 and 24), RAT-317 (SEQ 
ID NOs: 25 and 26), RAT-321 (SEQ ID NOs: 27 and 28), RAT-323 (SEQ ID 
NOs: 29 and 30), RAT-336 (SEQ ID NOs: 31 and 32), RAT-352 (SEQ ID NOs: 
33 and 34), RAT-411 (SEQ ID NOs: 35 and 36), RAT-413 (SEQ ID NOs: 37 and 
38), and RAT-551 (SEQ ID NOs: 39 and 40). 

The deduced amino acid sequences of the cDNA clones are
homologous to, but distinct from, the known cadherins. The cadherins described
thus far have highly conserved, short amino acid sequences in the third
extracellular subdomain (EC-3) including the consensus sequence D-Y-E or D-F-E
located at the middle region of the subdomain and the consensus sequence
D-X-N-E-X-P-X-F (SEQ ID NO: 41) or D-X-D-E-X-P-X-F (SEQ ID NO: 42) at
its end (Hatta et al., supra), while the corresponding sequences of other
subdomains, except for the fifth extracellular subdomain (EC-5), are D-R-E and
D-X-N-D-N-X-P-X-F (SEQ ID NO: 43), respectively. In contrast, the deduced
amino acid sequences of the new clones that correspond to cadherin extracellular
subdomains include the sequence D-Y-E or D-F-E at one end, but have the
sequence D-X-N-D-N-X-P-X-F instead of D-X-N-E-X-P-X-F or
D-X-D-E-X-P-X-F, at the other end. The polypeptides encoded by the partial
clones are homologous to previously identified cadherins but did not show
significant homology to any other sequences in Genbank. Therefore, the partial
cDNAs appear to comprise a new subclass of cadherin-related molecules.
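
The distinction drawn above rests on which consensus motif a deduced peptide carries at its end. A minimal sketch of that check is given below, treating X as "any residue". The helper names `consensus_to_regex` and `classify` are illustrative assumptions, and the peptides in the usage lines are invented test strings, not sequences from the Sequence Listing.

```python
import re


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Convert a hyphenated consensus such as 'D-X-N-E-X-P-X-F' to a regex.

    'X' stands for any amino acid and becomes the regex wildcard '.'.
    """
    return re.compile("".join("." if res == "X" else res for res in consensus.split("-")))


# Consensus sequences quoted in the text, keyed by their sequence identifiers.
MOTIFS = {
    "SEQ ID NO: 41": consensus_to_regex("D-X-N-E-X-P-X-F"),
    "SEQ ID NO: 42": consensus_to_regex("D-X-D-E-X-P-X-F"),
    "SEQ ID NO: 43": consensus_to_regex("D-X-N-D-N-X-P-X-F"),
}


def classify(peptide: str) -> list[str]:
    """Return the identifiers of every consensus motif found in the peptide."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(peptide)]


# Hypothetical peptides used only to exercise the function.
print(classify("GGDKNEAPVFGG"))  # carries the D-X-N-E-X-P-X-F form
print(classify("AADANDNAPAF"))   # carries the D-X-N-D-N-X-P-X-F form
```

A real analysis would run such patterns over the deduced amino acid sequences of the clones; the sketch only shows how the three consensus forms can be told apart.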

Example 2 

Various cDNA fragments structurally similar to the rat cDNAs 
described in Example 1 were isolated from human, mouse, and Xenopus brain 
cDNA preparations and from Drosophila and C. elegans whole body cDNA
preparations by PCR using Primers 1 and 2 as described in Example 1. The
DNA and deduced amino acid sequences of the resulting PCR fragments 
(including sequences corresponding to the PCR primers) are set out as follows: 
MOUSE-321 (SEQ ID NOs: 44 and 45), MOUSE-322 (SEQ ID NOs: 46 and 47), 
MOUSE-324 (SEQ ID NOs: 48 and 49), MOUSE-326 (SEQ ID NOs: 50 and 51), 
HUMAN-11 (SEQ ID NOs: 52 and 53), HUMAN-13 (SEQ ID NOs: 54 and 55), 
HUMAN-21 (SEQ ID NOs: 56 and 57), HUMAN-24 (SEQ ID NOs: 58 and 59), 
HUMAN-32 (SEQ ID NOs: 60 and 61), HUMAN-42 (SEQ ID NOs: 62 and 63), 
HUMAN-43 (SEQ ID NOs: 64 and 65), HUMAN-212 (SEQ ID NOs: 66 and 
67), HUMAN-213 (SEQ ID NOs: 68 and 69), HUMAN-215 (SEQ ID NOs: 70 
and 71), HUMAN-223 (SEQ ID NOs: 72 and 73), HUMAN-410 (SEQ ID NOs: 
74 and 75), HUMAN-443 (SEQ ID NOs: 76 and 77), XENOPUS-21 (SEQ ID 
NOs: 78 and 79), XENOPUS-23 (SEQ ID NOs: 80 and 81), XENOPUS-25 (SEQ 
ID NOs: 82 and 83), XENOPUS-31 (SEQ ID NOs: 84 and 85), DROSOPHILA- 
12 (SEQ ID NOs: 86 and 87), DROSOPHILA-13 (SEQ ID NOs: 88 and 89), 
DROSOPHILA-14 (SEQ ID NOs: 90 and 91) and C.ELEGANS-41 (SEQ ID 
NOs: 92 and 93). Comparison of the deduced amino acid sequences indicates 
significant similarity between sets of these clones. In particular, there are three 
sets of clones that appear to be cross-species homologues: RAT-218, MOUSE-322 
and HUMAN-43; RAT-314, MOUSE-321 and HUMAN-11; and MOUSE-326 
and HUMAN-42. 



Example 3 

To ascertain the complete structure of the new proteins defined by 
the PCR products, two full length human cDNAs corresponding to the partial 
cDNAs HUMAN-42 and HUMAN-43 were isolated. 
Isolation of Full-length Human cDNAs

A human fetal brain cDNA library (Stratagene, La Jolla, 
California) in the λZapII vector was screened by the plaque hybridization method
[described in Ausubel et al., Eds., Current Protocols in Molecular Biology,
Sections 6.1.1 to 6.1.4 and 6.2.1 to 6.2.3, John Wiley & Sons, New York
(1987)] with 32P-labelled HUMAN-42 and HUMAN-43 DNA fragments. The
positive clones were plaque-purified and, using a helper virus, the inserts were
cut out by an in vivo excision method in the form of a Bluescript SK(+) plasmid.
The insert sequences were then subcloned into the M13 vector (Boehringer
Mannheim Biochemicals) for sequencing. Several overlapping cDNA clones
were isolated with each probe including two cDNAs which contained the putative 
entire coding sequences of two novel proteins designated protocadherin-42 (pc42) 
and protocadherin-43 (pc43). The DNA and deduced amino acid sequences of 
pc42 are set out in SEQ ID NOs: 94 and 95, respectively, while the DNA and 
deduced amino acid sequences of pc43 are set out in SEQ ID NOs: 96 and 97, 
respectively. 

A description of the cloning of protocadherin sequences of the
invention was published in Sano et al., The EMBO Journal, 12(6): 2249-2256
(1993) after filing of the priority application hereto. The deduced amino acid
sequence of pc43 was previously presented at the December 9, 1991 meeting of
the American Society for Cell Biology. An abstract of the presentation is
published as Suzuki et al., J. Cell. Biol., 115: 72a (Abstract 416) (December 9,
1991).

Analysis of Full-length Human Clones

Comparison of the full length cDNA sequences of pc42 and pc43
to the sequences of the various DNA fragments originally obtained by PCR
reveals that MOUSE-326 and HUMAN-42 correspond to a portion of the fourth
extracellular subdomain (EC-4) of pc42, that RAT-314, MOUSE-321, and
HUMAN-11 correspond to a portion of the third extracellular subdomain (EC-3)
of pc43, and that RAT-218, MOUSE-322 and HUMAN-43 correspond to a portion of
the fifth extracellular subdomain (EC-5) of pc43.

The overall structures of pc42 and pc43 are similar to that of
typical cadherins but the new molecules also have distinct features. Both 
protocadherin cDNA sequences contain putative translation initiation sites and 
translated amino acid sequences start with typical signal sequences, but the clones 
lack the prosequences that are present in all known cadherin precursors. The 
cDNAs encode proteins having a large N-terminal extracellular domain and a 
relatively short C-terminal cytoplasmic domain connected by a transmembrane 
sequence. The extracellular domains of pc42 and pc43 are different in length and 
pc42 contains seven subdomains that closely resemble the typical cadherin 
extracellular subdomain while pc43 has six such subdomains. The sizes of the 
protocadherin cytoplasmic domains are similar to those of typical cadherins, but 
the sequences do not show any significant homology with those of known 
cadherins or cadherin-related proteins. 

Amino acid identity determinations between extracellular 
subdomains of human pc42 and pc43, and of mouse N-cadherin (SEQ ID NO: 98) 
(presented as an example of a "typical" cadherin) and the eighteenth extracellular 
subdomain of Drosophila fat tumor suppressor (EC-18, SEQ ID NO: 99) (the
eighteenth extracellular subdomain of fat is a prototypical fat subdomain) are
presented in Table 1 below, wherein, for example, "N-EC-1 x pc42" indicates
that the first extracellular subdomain of N-cadherin was compared to the
extracellular subdomain of pc42 indicated on the horizontal axis.

Table 1

                     EC-1  EC-2  EC-3  EC-4  EC-5  EC-6  EC-7
N-EC-1 x pc42         20    27    26    26    31    29    17
N-EC-1 x pc43         31    23    23    26    31    24
N-EC-2 x pc42         28    30    32    30    37    31    19
N-EC-2 x pc43         30    28    30    36    29    30
N-EC-3 x pc42         21    26    30    29    31    30    22
N-EC-3 x pc43         25    18    26    28    28    25
N-EC-4 x pc42         28    28    26    25    29    27    17
N-EC-4 x pc43         21    25    28    28    29    24
N-EC-5 x pc42         24    21    25    24    24    19    12
N-EC-5 x pc43         15    21    20    20    25    16
fat EC-18 x pc42      22    35    32    34    42    35    19
fat EC-18 x pc43      32    30    36    36    33    29

The amino acid identity values between the extracellular subdomains of pc42 and 
pc43, and N-cadherin EC-1 through EC-5 and Drosophila fat EC-18 are mostly 
less than 40%. These identity values are comparable to the values between the 
subdomains of other cadherin subclasses. However, the generally higher identity
values obtained with fat EC-18 indicate that pc42 and pc43 are more closely
related to fat than to N-cadherin.
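
The comparison drawn from Table 1 can be checked by averaging the tabulated values. The sketch below is not part of the original disclosure: it hard-codes the percentages as transcribed from Table 1, and the names `TABLE_1` and `mean_identity` are illustrative assumptions. Averaging over subdomains is only one possible summary; the text itself does not say how the comparison was made.

```python
from statistics import mean

# Percent identity values transcribed from Table 1.
# Keys are (reference subdomain, protocadherin); each list runs EC-1 onward.
TABLE_1 = {
    ("N-EC-1", "pc42"): [20, 27, 26, 26, 31, 29, 17],
    ("N-EC-1", "pc43"): [31, 23, 23, 26, 31, 24],
    ("N-EC-2", "pc42"): [28, 30, 32, 30, 37, 31, 19],
    ("N-EC-2", "pc43"): [30, 28, 30, 36, 29, 30],
    ("N-EC-3", "pc42"): [21, 26, 30, 29, 31, 30, 22],
    ("N-EC-3", "pc43"): [25, 18, 26, 28, 28, 25],
    ("N-EC-4", "pc42"): [28, 28, 26, 25, 29, 27, 17],
    ("N-EC-4", "pc43"): [21, 25, 28, 28, 29, 24],
    ("N-EC-5", "pc42"): [24, 21, 25, 24, 24, 19, 12],
    ("N-EC-5", "pc43"): [15, 21, 20, 20, 25, 16],
    ("fat EC-18", "pc42"): [22, 35, 32, 34, 42, 35, 19],
    ("fat EC-18", "pc43"): [32, 30, 36, 36, 33, 29],
}


def mean_identity(reference_prefix: str, protocadherin: str) -> float:
    """Average identity over all rows whose reference starts with the prefix."""
    values = [
        value
        for (reference, target), row in TABLE_1.items()
        if reference.startswith(reference_prefix) and target == protocadherin
        for value in row
    ]
    return mean(values)


for pc in ("pc42", "pc43"):
    print(pc, round(mean_identity("N-EC", pc), 1), round(mean_identity("fat", pc), 1))
```

For both protocadherins the average against fat EC-18 comes out higher than the average against the five N-cadherin subdomains, which is the direction of the conclusion stated in the text.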

Amino acid identity determinations between extracellular
subdomains of human pc42 and pc43 are presented in Table 2 below.

Table 2

                                 pc42
              EC-1  EC-2  EC-3  EC-4  EC-5  EC-6  EC-7
pc43 EC-1      33    27    29    26    25    26    25
pc43 EC-2      26    38    29    33    34    28    21
pc43 EC-3      26    32    41    30    32    31    22
pc43 EC-4      25    34    30    41    39    31    18
pc43 EC-5      23    32    29    27    36    34    16
pc43 EC-6      25    25    26    25    28    23    26

The identity values between respective EC-1, EC-2, EC-3, EC-4, EC-5
subdomains and the last subdomains of pc42 and pc43 are generally higher
than the values obtained for comparisons of the protocadherins to N-cadherin. These
results suggest that pc42 and pc43 are more closely related to one another than
they are to classic cadherins.
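
The identity values in Tables 1 and 2 are percentages obtained by comparing aligned subdomains residue by residue. The text does not state how gapped positions were counted, so the following sketch assumes one common convention: columns in which either sequence has a gap are skipped. The function name `percent_identity` and the two short aligned strings are hypothetical and serve only as an illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns that contain no gap.

    Both inputs must already be aligned, i.e. have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical six-column alignment: four matches in five ungapped columns.
print(percent_identity("DYEA-K", "DFEAGK"))
```

Other conventions, such as dividing by the length of the shorter sequence, give somewhat different numbers, so values computed this way need not reproduce the tables exactly.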

FIGURE 1A-C presents an alignment of the deduced amino acid
sequences of the extracellular subdomains of pc42 (EC-1 through EC-7), pc43
(EC-1 through EC-6), mouse N-cadherin (EC-1 through EC-5) and Drosophila fat
EC-18. A sequence on a line in FIGURE 1A continues on the same line in
FIGURES 1B and 1C. Gaps were introduced to maximize homology. The amino
acid residues described by capital letters in the "motif" line are present in more
than half of the subdomains of N-cadherin, pc42, pc43 and Drosophila fat. The
amino acid residues described by small letters in the motif line are less well
conserved in human pc42, pc43, and Drosophila fat. FIGURE 1A-C shows that
many amino acids characteristic of other cadherin extracellular domain repeats are
conserved in the pc42 and pc43 sequences, including the cadherin sequence motifs
DXD, DRE and DXNDNXPXF (SEQ ID NO: 43), two glycine residues, and one
glutamic acid residue. Additionally, pc42 and pc43 share unique features in
comparison to N-cadherin. More amino acids at specific sites are conserved
between pc42 and pc43, such as the DXDXGXN (SEQ ID NO: 100)
protocadherin sequence motif near the amino terminus of the pc42 and pc43 
subdomains and the AXDXGXP (SEQ ID NO: 101) sequence motif near the 
carboxyl terminus of the subdomains. Additionally, both protocadherins share 
regions that do not show significant homology with the typical cadherin motif (of 
N-cadherin) near the carboxyl terminus of EC-1, in the middle of EC-2 and EC-4, 
and at the carboxyl terminus of the last repeat. A cysteine residue is located at 
a similar position in the middle of EC-4 of pc42 and pc43. In general, the 
extracellular subdomains of pc42 and pc43 are more similar to EC-18 of fat than 
the extracellular subdomains of N-cadherin. 
Possible Alternative Splicing

Sequence analysis of various overlapping protocadherin cDNA 
clones revealed that some clones contained unique sequences at the 3' end, 
although the 5' end sequences were identical to other clones. The sequences 
forming the boundaries of the 3' end regions are consistent with the consensus
sequence of mRNA splicing, suggesting that these clones may correspond to
alternatively spliced mRNAs. The DNA and deduced amino acid sequences of
one possible product of alternative splicing of pc42 mRNA are set out in SEQ ID
NOs: 102 and 103. The DNA and deduced amino acid sequences of two possible
products of alternative splicing of pc43 mRNA are respectively presented in SEQ
ID NOs: 104 and 105, and SEQ ID NOs: 106 and 107.
Chromosome Localization

The chromosomal location of the protocadherin 413 gene (SEQ ID 
NO: 37) and of the pc42 and pc43 genes was determined by conventional 
methods.

Briefly, C3H/HeJ-gld and Mus spretus (Spain) mice and
[(C3H/HeJ-gld x Mus spretus) F1 x C3H/HeJ-gld] interspecies backcross mice
were bred and maintained as previously described in Seldin et al., J. Exp. Med.,
167: 688-693 (1988). Mus spretus was chosen as the second parent in the cross
because of the relative ease of detection of informative restriction fragment length
variants (RFLVs) in comparison with crosses using conventional inbred laboratory 
strains. Gene linkage was determined by segregation analysis. 
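
Segregation analysis in a backcross of this kind compares, animal by animal, the genotype at the probed locus with the genotype at a known marker. Animals in which the two differ are recombinants, and the fraction of recombinants estimates the genetic distance. The sketch below is a generic illustration under stated assumptions, not the procedure of the cited reference: the function names are hypothetical and the ten-animal typing strings are invented.

```python
from math import sqrt


def recombination_fraction(locus_a: str, locus_b: str) -> float:
    """Fraction of backcross animals whose genotypes at two loci differ.

    Each string holds one character per animal: 'C' for an animal homozygous
    for the C3H allele, 'S' for a heterozygote carrying the Mus spretus allele.
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("both loci must be typed in the same non-empty set of animals")
    recombinants = sum(a != b for a, b in zip(locus_a, locus_b))
    return recombinants / len(locus_a)


def standard_error(fraction: float, n_animals: int) -> float:
    """Binomial standard error of a recombination fraction."""
    return sqrt(fraction * (1 - fraction) / n_animals)


# Hypothetical typing results for ten backcross animals (not data from the text).
PROBE = "CCSSCSCSSC"
MARKER = "CCSSCSCSCC"

r = recombination_fraction(PROBE, MARKER)
print(f"recombination fraction {r:.2f} +/- {standard_error(r, len(PROBE)):.2f}")
```

For small fractions the recombination fraction multiplied by 100 approximates the map distance in centimorgans; a fraction near 0.5 indicates no linkage.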

Genomic DNA isolated from mouse organs by standard techniques 
was digested with restriction endonucleases and 10 µg samples were
electrophoresed in 0.9% agarose gels. DNA was transferred to Nytran
membranes (Schleicher & Schuell, Inc., Keene, NH), hybridized with the
appropriate probe at 65°C and washed under stringent conditions, all as
previously described in Maniatis et al., supra. To localize the pc42 gene, a
mouse sequence probe corresponding to nucleotides 1419 to 1906 of SEQ ID NO: 
94 was used and for pc43 a rat sequence probe corresponding to nucleotides 1060 
to 1811 of SEQ ID NO: 96 was used. To localize the protocadherin 413 gene, a
probe including the sequence set out in SEQ ID NO: 37 was used. Other clones 
used as probes in the current study and RFLVs used to detect anonymous DNA 
loci were all previously described [Chromosome 7, DNA segment, Washington
12 (D7Was12); the parathyroid hormone (Pth); calcitonin (Calc); hemoglobin, β
chain (Hbb); metallothionein-I (Mt-1); adenine phosphoribosyltransferase (Aprt);
growth hormone receptor (Ghr); prostaglandin E receptor EP2 subtype
(Ptgerep2); dihydrofolate reductase-2 (Dhfr2); fibroblast growth factor a (Fgfa);
and glucocorticoid receptor-1 (Grl-1)].

Comparison of the haplotype distribution of protocadherin genes 
with those determined for loci throughout the mouse genome allowed each to be 
mapped to specific regions of mouse chromosomes. The probability for linkage
was >99% and indicated assignment of both the pc42 gene and the pc43 gene
to chromosome 18. The protocadherin 413 gene was assigned to
chromosome 7. The region of chromosome 18 to which the pc42 and pc43 genes
were mapped corresponds to the ataxia (ax) loci [Burt, Anat. Rec., 196: 61-69
(1980) and Lyon, J. Hered., 46: 77-80 (1955)] and twirler (Tw) loci [Lyon, J.
Embryol. Exp. Morphol., 6: 105-116 (1958)], while the region of chromosome
7 to which the protocadherin 413 gene was mapped corresponds to the shaker
(sh-1) locus [Kikuchi et al., Acta Oto-Laryngol., 60: 287-303 (1965) and Lord et al.,
Am. Nat., 63: 435-442 (1929)]. These loci have been implicated as involved in
hereditary neural disease in the mouse. This result is consistent with in situ 
hybridization results (see Example 12) showing that pc42 and pc43 are strongly 
expressed in the brain and particularly in the cerebellum. 
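
The segregation analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: each backcross animal is assumed to be typed at each locus as 'C' (homozygous for the inbred parental allele) or 'S' (heterozygous, carrying the Mus spretus allele), the map distance is approximated as the recombination percentage, and the haplotype data shown are invented toy values. The function name is hypothetical.

```python
from math import comb, sqrt


def backcross_linkage(locus_a, locus_b):
    """Compare haplotypes of two loci typed in the same backcross animals.

    Each element is 'C' (homozygous for the inbred parental allele) or
    'S' (heterozygous, carrying the Mus spretus allele).
    Returns (recombinants, map distance in cM, standard error in cM,
    probability of seeing this few recombinants if the loci were unlinked).
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("both loci must be typed in the same, non-empty set of animals")
    n = len(locus_a)
    recombinants = sum(1 for a, b in zip(locus_a, locus_b) if a != b)
    r = recombinants / n
    std_err = sqrt(r * (1 - r) / n)
    # One-sided binomial tail under the no-linkage hypothesis (r = 0.5).
    p_unlinked = sum(comb(n, k) for k in range(recombinants + 1)) / 2 ** n
    return recombinants, 100 * r, 100 * std_err, p_unlinked


# Toy data (hypothetical): 20 animals, two of which are recombinant.
marker = list("CS" * 10)
gene = marker.copy()
for i in (3, 12):
    gene[i] = "S" if gene[i] == "C" else "C"

print(backcross_linkage(marker, gene))
```

A small tail probability under the no-linkage hypothesis corresponds to a high probability of linkage; with real data the comparison would be repeated against reference loci across the genome.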



Example 4 

Two additional novel human protocadherin cDNAs and one 
additional novel rat protocadherin cDNA were isolated using rat protocadherin 
fragments described in Example 1 as probes. 

Initially, the rat clone RAT-214 (SEQ ID NO: 7) was used as a 
probe to screen a rat brain cDNA library (Stratagene, La Jolla, CA). The final 
washing step was performed twice at 50°C in 0.1X SSC with 0.1% SDS for 15
minutes. Various clones were identified which contained partial cDNA inserts 
encoding related protocadherin amino acid sequences. The nucleotide sequence 
of one novel rat clone designated #6-2 is set out in SEQ ID NO: 108. The first 
fifteen nucleotides of SEQ ID NO: 108 are the sequence of a linker and are not 
part of the rat #6-2 clone. 

A human fetal brain cDNA library obtained from Stratagene was 
screened with the 0.7 kbp PstI fragment of clone #6-2. The fragment appears to 
encode the EC-2 and EC-3 of the rat protocadherin. After screening about 2 x 10^5
phages, eleven positive clones were isolated. Sequencing of the clones identified 
a novel full length human protocadherin cDNA designated human pc3. The 
nucleotide and deduced amino acid sequence of human pc3 are set out in SEQ ID 
NOs: 109 and 110. 

The 0.7 kbp PstI fragment of rat clone #6-2 was also used to 
rescreen the Stratagene rat brain cDNA library for full length rat cDNA clones. 
A clone containing an insert encoding a full length novel protocadherin cDNA
was isolated. The DNA and deduced amino acid sequence of the insert are set
out in SEQ ID NOs: 111 and 112. The full length rat cDNA was named pc5
because it does not appear to be the homolog of the human pc3 clone based upon 
a comparison of the sequences. 

Concurrently, the 0.8 kbp EcoRI-PstI fragment of partial rat
cDNA designated #43 (SEQ ID NO: 113), which was obtained by screening the
Stratagene rat brain cDNA library with a probe corresponding to the human pc43 
cytoplasmic domain, was used to probe the Stratagene human cDNA library for 
full length human protocadherin cDNAs. The fragment appears to encode EC-3 
through the beginning of EC-6 of clone #43. One partial clone identified encodes 
a novel human protocadherin named human pc4. The nucleotide sequence and 
deduced amino acid sequences of the human pc4 clone are set out in SEQ ID 
NOs: 114 and 115. The amino acid sequence encoded by the pc4 clone appears 
to begin in the middle of EC-2 of pc4 and continues through the cytoplasmic tail 
of the protocadherin. 

Example 5 

The full length human cDNAs encoding pc42 and pc43 were 
expressed in L cells (ATCC CCL 1) using the pRC/RSV expression vector 
(Invitrogen, San Diego, California). The cDNAs were isolated from the 
Bluescript SK(+) clones described in Example 2 by digestion with SspI followed
by blunt-ending with DNA polymerase and digestion with XbaI (for pc42), or by
double digestion with SpeI and EcoRV (for pc43). The pRC/RSV expression
vector was digested with HindIII, followed by blunt-ending and re-digestion with
XbaI for insertion of pc42 sequences, or by digestion with XbaI followed by
blunt-ending and re-digestion with SpeI for insertion of pc43 sequences. The
isolated protocadherin DNAs were ligated into the linearized pRC/RSV vector. 
The resulting pc42 expression plasmid designated pRC/RSV-pc42 (ATCC 69162) 
and pc43 expression plasmid designated pRC/RSV-pc43 (ATCC 69163) were
purified by CsCl gradient centrifugation and transfected into L cells by a calcium
phosphate method.

The pc42 and pc43 transfectants were morphologically similar to 
the parental cells. Northern blot analysis of L cells transfected with pc42 or pc43 
DNA sequences showed that the transfected cells expressed mRNAs of a size
expected to encode the particular protocadherin. 

Example 6

Rabbit polyclonal antibodies specific for pc42 and pc43 were 
generated as well as a mouse monoclonal antibody specific for pc43. 
Preparation of Polyclonal Antibodies Specific for pc42 and pc43

DNA sequences encoding portions of the extracellular domain of 
pc42 and pc43 were each fused to a maltose binding protein-encoding sequence 
and expressed in bacteria. Specifically, DNAs corresponding to EC-4 through 
EC-7 of pc42 and EC-3 through EC-5 of pc43 were prepared by PCR and 
subcloned in the correct reading frame into the multicloning site of the pMAL
expression vector (New England Biolabs, Beverly, Massachusetts) which contains
sequences encoding maltose binding protein immediately upstream of the 
multicloning site. The resulting plasmids were then introduced into E. coli 
NM522 cells (Invitrogen, San Diego, California) by a single step transformation 
method. Expression of the fusion proteins was induced by the addition of IPTG
and the fusion proteins were purified from cell extracts by amylose resin affinity 
chromatography (New England Biolabs) as described by the manufacturer. The 
fusion proteins were used for the immunization of rabbits without further 
purification. 

Polyclonal antibodies were prepared in rabbits by immunization at
four subcutaneous sites with 500 µg of purified fusion protein in Freund's
complete adjuvant. Subsequent immunizations with 100 µg of the fusion protein
were in Freund's incomplete adjuvant. Immune sera were passed through
Sepharose coupled to maltose binding protein (New England Biolabs) and
polyclonal antibodies were purified from immune sera using Sepharose affinity 
columns prepared by reaction of the purified fusion protein with CNBr Sepharose 
(Pharmacia). Reactivity of the polyclonal sera with purified pc42 fusion protein 
and pc42 transfected cell extracts (described in Example 5) was confirmed. 
Preparation of Monoclonal Antibodies Specific for pc43

The pc43 fusion protein (containing the EC-3 through EC-5 
subdomains of pc43) was used to generate monoclonal antibodies in mice 
according to the method of Kennett, Methods in Enzymol., 55:345-359 (1978).
Briefly, mice were immunized with the pc43 fusion protein (100 µg) at two
subcutaneous sites. The spleen from the highest titer mouse was fused to the NS1 
myeloma cell line. The resulting hybridoma supernatants were screened in an
ELISA for reactivity with the pc43 fusion protein and with maltose binding
protein. The fusion wells with the highest reactivity to the pc43 extracellular 
domains were subcloned. The hybridoma cell line designated 38I2C (ATCC HB 
11207) produced an IgG1 subtype monoclonal antibody specific for pc43.
Reactivity of the monoclonal antibody produced by hybridoma cell line 38I2C to 
pc43 was confirmed by immunoblotting the pc43 L cell transfectants described in 
Example 5. The 38I2C monoclonal antibody is specific for human pc43. 

Example 7 

L cells transfected with DNA sequences encoding pc42 and pc43 
as prepared in Example 5 were assayed for expression of the protocadherins by 
immunoblot and by immunofluorescence microscopy. 
Immunoblot Analysis 

Cell extracts of pc42 and pc43 transfectants were subjected to 
SDS-PAGE and then blotted electrophoretically onto a PVDF membrane 
(Millipore, Bedford, Massachusetts). The membranes were incubated with 5% 
skim milk in Tris-buffered saline (TBS) for two hours and then respectively with
either pc42 polyclonal sera or pc43 monoclonal antibody for one hour. The
membranes were washed three times (for 5 minutes each wash) with TBS 
containing 0.05% Tween 20 and respectively incubated with alkaline 
phosphatase-conjugated anti-rabbit IgG antibody or anti-mouse IgG antibody 
(Promega, Madison, Wisconsin) in the same buffer for one hour. After washing 
the membranes with TBS containing 0.05% Tween 20, reactive bands were 
visualized by using Western Blue solution (Promega).

Anti-pc42 polyclonal antibodies stained a band of about 170 kDa 
molecular weight in pc42 transfected cells, but not parental L cells. The pc43- 
specific monoclonal antibody (38I2C) and polyclonal antibodies stained two 
adjacent bands of about 150 kDa molecular weight in pc43 transfected cells. The 
pc43 antibodies did not stain bands in parental L-cells. The molecular weights 
indicated by the staining of bands by the pc42 and pc43 antibodies are 
significantly larger than the molecular weights predicted from the deduced amino 
acid sequences. This discrepancy in molecular weight is common among various 
cadherin-related proteins and may be attributable to glycosylation and/or
cadherin-specific structural properties. The pc42 antibody also stained smaller
bands, which may be proteolytic degradation products. 

When transfected cells were trypsinized and cell extracts were 
prepared, run on SDS/PAGE and immunoblotted with the appropriate antibody, 
the pc42 and pc43 polypeptides expressed by the transfected cells were found to 
be highly sensitive to proteolysis and were easily digested by 0.01% trypsin 
treatment. In contrast to the classic cadherins, however, these proteins were not 
protected from the digestion in the presence of 1-5 mM Ca2+.
Immunofluorescence Microscopy 

Transfected cells were grown on a cover slip precoated with 
fibronectin and were fixed with 4% paraformaldehyde for 5 minutes at room 
temperature or with cold methanol on ice for 10 minutes followed by 4% 
paraformaldehyde fixation. After washing with TBS, the cells were incubated with
TBS containing 1% BSA for 30 minutes and then with anti-pc42 polyclonal
antibody or anti-pc43 monoclonal antibody in TBS containing 1% BSA for 1 hour
at room temperature. Cover slips were then washed with TBS containing 0.01%
BSA and respectively incubated with FITC-conjugated anti-rabbit antibody or
anti-mouse antibody (Cappel, Durham, North Carolina) for 60 minutes at room
temperature. The cells were washed again with TBS containing 0.01% BSA and
subjected to fluorescence microscopy. Both pc42-specific and pc43-specific 
polyclonal antibodies stained the cell periphery of transfected cells expressing the 
protocadherin proteins, mainly at the cell-cell contact sites. The antibodies did 
not stain the parent L cells, nor did rabbit preimmune sera stain the pc42 and 
pc43 transfectants.

Example 8 

The cell aggregation properties of the transfected L cells expressing 
protocadherin proteins were examined. Transfected L cells were cultured in 
Dulbecco's Modified Eagles Medium (DMEM) (Gibco, Grand Island, New York) 
supplemented with 10% fetal bovine serum at 37°C in 5% CO2. Cells grown near
confluence were treated with 0.01% trypsin in the presence of 1 mM EGTA for
25 minutes on a rotary shaker at 37°C and collected by centrifugation. The cells
were washed three times with Ca2+-free HEPES-buffered saline (HBS) after
adding soybean trypsin inhibitor, and were resuspended in HBS containing 1%
BSA. The cell aggregation assay [Urushihara et al., Dev. Biol., 70: 206-216
(1979)] was performed by incubating the resuspended cells in a 1:1 mixture of
DMEM and HBS containing 1% BSA, 2 mM CaCl2 and 20 µg/ml of
deoxyribonuclease on a rotary shaker at 37°C for 30 minutes to 6 hours.

The pc42 and pc43 transfectants did not show any significant cell 
aggregation activity during periods of incubation less than 1 hour. This is in 
contrast to the cell aggregation that occurs with classic cadherins in similar 
experiments (Nagafuchi et al., supra, and Hatta et al., supra). However,
prolonged incubation of transfected cells (more than 1-2 hours) resulted in gradual
re-aggregation of the cells into small aggregates. Similar results were obtained 
when single cell suspensions of transfected cells were prepared by trypsin 
treatment in the presence of Ca2+. No re-aggregation was observed under the
same conditions when untransfected L cells or L cells transfected with pRC/RSV 
vector alone were tested. When pc43 transfectants labelled with DiO (Molecular 
Probes, Eugene, OR) were incubated with unlabelled pc42 transfectants in the cell 
aggregation assay, aggregation of labelled and unlabelled cells was almost 
mutually exclusive indicating that protocadherin binding is homophilic. 
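
One common way to put numbers on an assay of this kind is sketched below. It assumes that particle counts are available at the start of incubation (N0) and at time t (Nt), and that each aggregate in a mixed assay has been scored for labelled and unlabelled cells. Neither the index nor the counts are taken from this specification; all names and values are hypothetical.

```python
def aggregation_index(n0: int, nt: int) -> float:
    """Fraction of particles lost to aggregation: (N0 - Nt) / N0."""
    if n0 <= 0:
        raise ValueError("n0 must be positive")
    return (n0 - nt) / n0


def mixed_fraction(aggregates) -> float:
    """Fraction of aggregates containing both labelled and unlabelled cells.

    `aggregates` is a sequence of (labelled_count, unlabelled_count) pairs.
    """
    if not aggregates:
        raise ValueError("no aggregates scored")
    mixed = sum(1 for lab, unlab in aggregates if lab > 0 and unlab > 0)
    return mixed / len(aggregates)


# Toy counts (hypothetical).
print(aggregation_index(1000, 600))             # 0.4
scored = [(5, 0), (0, 7), (4, 0), (0, 6), (3, 1)]
print(mixed_fraction(scored))                   # 0.2
```

A mixed fraction close to zero would correspond to the nearly mutually exclusive sorting of labelled and unlabelled cells reported above.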

In view of the fact that the protocadherin cytoplasmic domains 
exhibit no apparent homology to cadherin domains, experiments were performed 
to determine if the difference in cytoplasmic domains could account for the 
difference in cell aggregation activity observed in cadherin and protocadherin 
transfectants. The cytoplasmic domain of pc43 was replaced with the cytoplasmic 
domain of E-cadherin and aggregation of cells transfected with the chimeric 
construct was analyzed. 

The Bluescript SK(+) clone described in Example 2 which 
contained the entire coding sequence for pc43 was digested with EcoRV and then 
partially digested with XbaI to remove the sequence corresponding to the
cytoplasmic domain, and the plasmid DNA was purified by agarose gel 
electrophoresis. The cDNA corresponding to the cytoplasmic domain of mouse 
E-cadherin was synthesized by PCR using mouse cDNA made from mouse lung 
mRNA as a template and specific primers corresponding to a region near the N- 
terminus of the cytoplasmic domain sequence or the region containing the stop 
codon of mouse E-cadherin (Nagafuchi et al., supra). An XbaI recognition sequence
was added to the 5' end of the upstream primer. The E-cadherin cytoplasmic
domain cDNA was then subcloned into the linearized pc43 Bluescript clone. The
DNA containing the entire resulting chimeric sequence was cut out with SpeI and
EcoRV and was subcloned into the SpeI-blunted XbaI site of the pRC/RSV
expression vector. Finally, L cells were transfected with the resultant construct by
a calcium phosphate method. After screening with G418 for about 10 days, the
transfectants were stained with FITC-labeled 38I2C anti-pc43 antibody and 
subjected to FACS analysis. A portion of the highly labeled cells was isolated and
cloned. Transfectants showed a morphology similar to that of parental L cells and 
the expressed protein was localized at the cell periphery using pc43 antibody for 
immunofluorescence microscopy. 

Cell aggregation activity of the chimeric transfectants was analyzed
as follows. The chimeric pc43 transfectants were labeled with DiO for 20
minutes at room temperature. The resultant cells were trypsinized in the presence
of 1 mM EGTA and a single cell suspension was made. Then, the cells were mixed
with unlabeled transfectants of another type and incubated on a rotary shaker for two
hours. The results were examined with a fluorescence and phase contrast
microscope. Antibody inhibition of cell aggregation was examined by
incubation of the transfectants in the presence of polyclonal anti-pc43 antibody
(100 ng/ml) in the standard assay medium.

In the cell aggregation assay, the chimeric pc43 transfectants 
showed clear Ca2+-dependent cell aggregation within forty minutes of incubation.
Cell aggregation was inhibited by the addition of pc43-specific polyclonal
antibody. 

Example 9

The procedures of Maruyama et al., J. Biochem., 95: 511-519
(1984) were used to determine the calcium binding properties of pc43 by Western 
blot analysis in the presence or absence of calcium-45. The pc43 fusion protein 
described in Example 6 containing pc43 subdomains EC-3 through EC-5 was 
compared to the calcium binding protein calmodulin. Samples of purified pc43 
fusion protein were run on SDS/PAGE and electrophoretically transferred to
PVDF membrane. Binding of 45Ca2+ to the pc43 fusion protein was detected
by autoradiography and was determined to be nearly as efficient as binding
of 45Ca2+ to calmodulin. In contrast, there was no binding of calcium to purified
maltose binding protein lacking the pc43 extracellular domain. The pc43
subdomains EC-3 through EC-5 contain sequences highly homologous to the
putative Ca2+ binding motifs found in E-cadherin. [See, Ringwald et al., EMBO
J., 6: 3647-3653 (1987).] 

Example 10 

The expression of mRNA encoding pc42 and pc43 was assayed in 
various tissues and cell lines by Northern blot. 

Total RNAs were prepared by the guanidinium isothiocyanate method
and poly(A)+ RNAs were isolated using a FastTrack kit (Invitrogen). RNA 
preparations were electrophoresed in a 0.8% agarose gel under denaturing 
conditions and transferred onto a nitrocellulose filter using a capillary method. 
Northern blot analyses were performed according to the method of Thomas, Proc. 
Natl. Acad. Sci. USA, 77: 5201-5205 (1980). The final wash was in 0.2X 
standard saline citrate containing 0.1% sodium dodecyl sulfate at 65°C for 10
minutes. 

Protocadherin mRNA Expression in Adult Rat Tissues

Total mRNA preparations of rat tissues including brain, heart, 
liver, lung, skin, kidney and muscle were separated electrophoretically under 
denaturing conditions (10 ng mRNA/lane) and transferred onto nitrocellulose 
filters. The filters were hybridized with 32P-labelled cDNA fragments MOUSE-326
(which corresponds to EC-4 of human pc42) and RAT-218 (which
corresponds to EC-5 of human pc43). The mRNAs of both protocadherins were 
highly expressed in brain. The pc42 probe detected a major band of 7 kb and a 
minor band of 4 kb in size, possibly representing the products of alternative 
splicing. The pc43 probe hybridized to a major band of 5 kb in size and with 
minor bands of smaller sizes. 

Developmental Expression of Protocadherin mRNA in Rat Brain

To examine the developmental regulation of mRNA expression of 
the protocadherins, brain mRNA from rats at embryonic days 17 and 20, neonatal
days 5 and 11 and from adult rats was prepared and subjected to Northern blot
analysis as described above for other rat tissues. β-actin was used as an internal
standard. mRNA levels for pc42 and pc43 proteins increased during embryonic
development of the brain as compared with β-actin expression.
Protocadherin mRNA Expression in Human Cell Lines

Several neuronal and glial cell lines (including human SK-N-SH 
neuroblastoma, human U251 glioma, and mouse Neuro-2a neuroblastoma cell 
lines) were assayed by Northern blot using 32P-labelled probes for expression of pc42 and
pc43 mRNA. Human cell lines were probed with HUMAN-42 (which
corresponds to EC-4 of human pc42) and HUMAN-43 (which corresponds to
EC-5 of human pc43) cDNA fragments while the mouse cell line was probed with
MOUSE-326 (which corresponds to EC-4 of human pc42) and RAT-322 (which 
corresponds to EC-5 of human pc43) cDNA fragments. SK-N-SH human 
neuroblastoma cells and U251 human glioma cells were found to express pc43 
mRNA and Neuro-2a mouse neuroblastoma cells were found to express pc42 
mRNA. 

Example 11 

Expression of pc43 protein in various tissues, extracts and cells was 
assayed by Western blot and immunofluorescence microscopy. 
Expression in Rat Cardiac Muscle Extracts

A rat heart non-ionic detergent extract was prepared by freezing a 
heart in liquid nitrogen after removal, powdering in a mortar and pestle, grinding 
briefly in a polytron in 0.5% Nonidet P40 in [10 mM PIPES (pH 6.8), 50 mM
NaCl, 250 mM (NH4)2SO4, 300 mM sucrose, 3 mM MgCl2] and microfuging for 15
minutes. Samples were separated by SDS/PAGE and electrophoretically
transferred to nitrocellulose (Towbin et al., PNAS 76: 4350-4354, 1979). Two
pc43 protein bands with molecular weights of 150 kDa and 140 kDa were
detected with rabbit polyclonal antibodies to pc43 by the immunoblot method
described in Example 7.

Expression in Tissue Sections and Cells 

To determine the localization of the protocadherins in various 
tissues, human and rat adult tissues were removed, incubated in 30% sucrose in
PBS for 30 minutes at 4°C, embedded in OCT compound (Tissue-Tek, Elkhart,
Indiana) in cryomolds and quickly frozen. Six micron sections were cut and
placed on glass slides. The slides were washed with PBS and fixed in 3%
paraformaldehyde for 5 minutes. To permeabilize the tissue sections, the slides were
immersed in -20°C acetone for 10 minutes and air dried. The sections were
blocked with 2% goat serum and 1% BSA in PBS for 30 minutes and then
incubated with the rabbit anti-pc43 polyclonal antisera for 1 hour at room
temperature. The sections were rinsed 3 times in PBS containing 0.1% BSA and
incubated with a biotinylated anti-rabbit antibody (Vector Laboratories, Burlingame,
California) in 1% BSA in PBS for 30 minutes. After rinsing 3 times, streptavidin
conjugated with FITC (Vector Laboratories) was added for 30 minutes and the
sections were again washed 3 times. For co-localization studies, an appropriate primary antibody was
used with a TRITC-conjugated secondary antibody. 
A. Muscle 

Immunolocalization of pc43 in rat cardiac muscle shows that pc43
is localized in a repeating pattern which is consistent with pc43 being associated 
with the sarcomeres. Sarcomeres are repetitive contractile units between the 
fascia adherens in skeletal and cardiac muscle. Co-localization with cytoskeletal 
proteins shows that pc43 is present at the ends of the sarcomeres in the Z lines 
which are associated with desmin and the actin-binding proteins vinculin and
alpha-actinin. The thin microfilaments of F-actin are associated with the thick
myosin filaments between the Z lines. In contrast, N-cadherin is localized at the 
ends of cardiac myocytes at the fascia adherens junctions at sites of 
myocyte:myocyte contact. The localization of pc43 in cardiac muscle suggests
that pc43 may play a role in muscle contraction in the anchoring of the contractile
apparatus to the plasma membrane. 

Similar localization for pc43 was observed in rat skeletal muscle. 
Ultrastructural studies have shown that dystrophin, the gene product lacking in 
Duchenne muscular dystrophy, is a component of the sarcolemma [Porter et al.,
J. Cell Biol., 117: 997-1005 (1992)]. The sarcolemma is connected to the
contractile apparatus at the M and Z lines where pc43 is localized. 

B. Brain 

Reactivity of anti-pc43 polyclonal antibody and monoclonal 
antibody 38I2C on frozen sections of rat and human cerebellum, respectively, 
shows that the major sites of pc43 expression are located in Purkinje cells and the 
granule cell layer which contains numerous small neurons. 

C. Placenta 

Strong reactivity of monoclonal antibody 38I2C with human 
syncytiotrophoblasts was also observed in development of the placenta at an early 
stage (5-7 weeks of gestation). Expression appeared to gradually decrease as the
stage progressed indicating that pc43 may be involved in the implantation of 
fertilized eggs into the placenta. 

D. Neuroblastoma and Astrocytoma Cells

Immunocytochemical localization of pc43 in SK-N-SH
neuroblastoma cells and UW28 astrocytoma cells using anti-pc43 antibodies
reveals a punctate cell surface distribution of pc43 and in some cells there is a 
localization at the tips of extensions of neuronal foot processes. At sites of cell- 
cell contact of UW28 astrocytoma cells, pc43 is organized in a series of parallel 
lines. The lines start at the contact site and extend approximately 5 microns.
F-actin microfilaments were identified with rhodamine-phalloidin (Molecular Probes,
Eugene, Oregon), as described by the manufacturer, showing that the
microfilaments in the cell appear to end in the pc43 linear structures which extend 
from the edge of the cell at sites of cell contact.

Immunoblotting studies with pc43 specific antibodies show that a
protein with a molecular weight of 140 kDa is recognized in human SK-N-SH
neuroblastoma cells and in UW28 astrocytoma cells. 

E. Osteoblasts

Immunocytochemical localization of pc43 using monoclonal 
antibody 38I2C in two human osteogenic sarcoma cell lines [SaOS (ATCC HTB
85) and MG-63 (ATCC CRL 1427)] and in cultures of normal human trabecular
osteoblasts [culture system described in Civitelli et al., J. Clin. Invest., 91: 1888-
1896 (1993)] showed that pc43 is expressed in osteoblasts in a pattern similar to 
that seen in UW28 astrocytoma cells. At sites of cell-cell contact, pc43 is 
organized in a series of parallel lines that appear to correspond to the actin stress 
fibers. In addition, in some cells, pc43 appears to localize at the tips of 
contacting cell processes. Northern blot analysis provides additional evidence that 
pc43 is expressed in normal human trabecular osteoblasts. A pc43-specific DNA
probe hybridized to a major band of 5 kb in samples of poly-A mRNA isolated 
from normal human trabecular osteoblasts. 

Example 12 

In situ hybridization experiments using protocadherin specific RNA 
probes were performed on cryosections of rat tissue. 

Sense and antisense 35S-riboprobes were made using the standard
procedure described by Promega (Madison, Wisconsin). An approximately 400
bp EcoRI-XbaI fragment of the MOUSE-326 cDNA clone was used as a pc42
specific probe. This fragment encodes the middle of EC-3 to the end of EC-4 of
pc42. An approximately 700 bp SmaI fragment of the RAT-218 cDNA clone was
used as a pc43 specific probe. The fragment encodes the end of EC-3 to the end 
of EC-5 of pc43. 

Rat adult tissues were harvested and immediately embedded with 
OCT Compound (Tissue-Tek) in cryomolds and quickly frozen in a bath of 95% 
ethanol/dry ice. The frozen blocks were stored at -80°C until cut. Six micron
tissue sections were cut using a cryostat (Reichert-Jung, Model #2800 Frigocut
N, Leica, Inc., Gilroy, California). Cut tissue sections were stored at -80°C.

The in situ protocol used was a variation of that described by 
Angerer et al. , Methods in Enzymology, 152: 649-660, (1987). All solutions were 
treated with diethylpyrocarbonate (DEPC, Sigma, St. Louis, Missouri) to remove 
RNase contamination. The tissue sections were first fixed in 4% 
paraformaldehyde at 4°C for 20 minutes. To remove excess paraformaldehyde
and stop the tissue fixation, the slides were washed in PBS (phosphate buffered
saline), dehydrated in a graded series of alcohols (70, 95, 100%) and then dried.
To prevent the tissue from detaching from the glass slide during the in situ 
procedure, the tissue sections were treated in a poly-L-lysine solution (Sigma) at 
room temperature for 10 minutes. To denature all RNA in the tissue, the sections 
were placed in a solution of 70% formamide/2x SSC (0.3 M NaCl/0.03 M Na
citrate, pH 7.0) at 70°C for 2 minutes after which they were rinsed in chilled 2x
SSC, dehydrated in a graded series of alcohols and then dried. Once dried, the 
sections were prehybridized in hybridization buffer [50% formamide/50 mM DTT 
(dithiothreitol)/0.3 M NaCl/20 mM Tris, pH 8.0/5 mM EDTA/1X Denhardt's
(0.02% Ficoll Type 400/0.02% polyvinylpyrrolidone/0.02% BSA)/10% Dextran 
Sulfate] at the final hybridization temperature for approximately 4 hours. After 
prehybridization, approximately 1 X 10* cpm of the appropriate riboprobe was
added to each section. The sections were generally hybridized at 45°C overnight
(12-16 hours). To ensure that the hybridization seen was specific, in some
experiments the hybridization stringency was increased by raising the
hybridization temperature to 50°C. As both the 45°C and 50°C experiments gave
comparable results, the standard hybridization temperature used was 45°C.

To remove excess, nonhybridized probe, the sections were put 
through a series of washes. The sections were first rinsed in 4X SSC to remove 
the bulk of the hybridization solution and probe. Next a 15 minute wash in 4X 
SSC/50 mM DTT was carried out at room temperature. Washes at increased
stringencies were also utilized. A 40 minute wash in 50% formamide/2X SSC/50
mM DTT was performed at 60°C. Four final room temperature washes were
carried out for 10 minutes each: two in 2X SSC and two in 0.1X SSC. The 
washed slides were dehydrated in a graded series of alcohols and dried. 
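
The salt content of the wash series above can be tabulated with a short Python sketch. It assumes the conventional definition of 1X SSC (0.15 M NaCl, 0.015 M trisodium citrate), which is a standard laboratory convention rather than a value taken from this passage; the function name and the table layout are hypothetical.

```python
# Assumed standard definition of 1X SSC (a convention, not stated in the wash protocol).
NACL_1X_M = 0.15
CITRATE_1X_M = 0.015


def ssc(strength: float):
    """Return (NaCl, sodium citrate, total Na+) molarities for n-fold SSC."""
    nacl = strength * NACL_1X_M
    citrate = strength * CITRATE_1X_M
    total_na = nacl + 3 * citrate   # trisodium citrate contributes 3 Na+
    return nacl, citrate, total_na


# SSC strengths of the wash series described above, in order of use.
washes = [("4X SSC rinse", 4.0), ("4X SSC/50 mM DTT", 4.0),
          ("50% formamide/2X SSC/50 mM DTT", 2.0),
          ("2X SSC", 2.0), ("0.1X SSC", 0.1)]

for name, strength in washes:
    nacl, citrate, na = ssc(strength)
    print(f"{name}: NaCl {nacl:.3f} M, citrate {citrate:.4f} M, Na+ {na:.4f} M")
```

Lower salt corresponds to higher stringency, so the series moves from the least stringent rinse (4X SSC) to the most stringent salt condition (0.1X SSC), with formamide and elevated temperature adding stringency at the intermediate step.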

To visualize the hybridized probe, the slides were dipped in Kodak 
NTB2 nuclear emulsion (International Biotechnology, New Haven, Connecticut) 
which had been diluted 1:1 in dH2O. Once dry, the slides were stored at 4°C in
light-tight boxes for the appropriate exposure time. The in situ slides were 
independently viewed by two persons and scored positive or negative for 
hybridization signal. 

All in situ hybridization studies were performed on rat tissue. 
Because results from Northern blot experiments (see Example 10) indicated that
both pc42 and pc43 are expressed in adult brain, in situ hybridization studies were 
carried out to localize the expression of these molecules to specific brain cell 
types. Hybridization seen in the normal adult rat brain was specific (no
background hybridization was seen with the sense probes) and was localized to 
specific regions in the brain. The overall pattern of expression seen for pc42 and 
pc43 was very similar, with the major difference being in the level of expression. 
pc43 appears to be expressed at a lower level than pc42. Both molecules are 
expressed in the germinal and pyramidal cells of the hippocampus, Purkinje cells 
of the cerebellum and neurons in grey matter. In addition, pc42 is expressed in 
glial cells in the white matter but, in contrast to the expression of pc43 in glioma 
cell lines (as described in Example 9), expression of pc43 in normal glial cells 
was not observed. In the spinal cord, both protocadherins are expressed in the 
motor neurons in the gray matter and pc42 is expressed in the glial cells in the 
white matter. 

When expression of both protocadherin molecules was analyzed in 
brains and spinal cords from rats having EAE (experimental allergic 
encephalomyelitis) [Vandenbark et al., Cell. Immunol., 12: 85-93 (1974)], the 
same structures as described above were found to be positive. In addition, 
expression of pc42 was observed in the leukocytic infiltrates in the EAE tissues. 
Expression of pc42 in leukocytes was confirmed by in situ hybridization analysis 
of two leukocytic cell lines, RBL-1 and y3. 

Expression of both protocadherin-42 and -43 was observed in the 
developing brain of rat embryos at all embryological days tested (E15-E19). In 
addition protocadherin-43 was observed in the developing rat heart at all 
embryological days tested (E13-E19). This finding is consistent with the 
immunohistochemistry results showing protocadherin-43 expression in adult heart. 

To determine possible roles of protocadherins in the development 
of the nervous system, expression profiles of protocadherin members in 
developing rat brain and adult rat brain were also examined by in situ 
hybridization. A series of coronal, sagittal and horizontal sections of rat brains 
at postnatal days 0, 6, 14, 30 (P0 through P30) and at 3 months (young adult) 
were hybridized with labelled cRNA probes corresponding to various 
protocadherins of the invention including pc42, pc43, RAT-212, RAT-411, and 
RAT-418. In developing brain, RAT-411 was expressed at high levels in neurons 
of the olfactory bulb, i.e., mitral cells and periglomerular cells. The expression 
of RAT-411 mRNA was transient; expression appeared at P0, peaked at P6, 
diminished by P14, and was undetectable at P30 and in adult brain. In the adult, 
pc43 mRNA was found to be expressed predominantly in Purkinje cells in the 
cerebellum. The expression of pc43 mRNA in Purkinje cells was observed from 
the beginning of Purkinje cell differentiation at around P6. Other protocadherin 
members were expressed at very low levels in various areas of developing and 
adult brains. These results indicate that protocadherin members are differentially 
expressed during the development of the central nervous system, and suggest that 
RAT-411 and pc43 have specific roles during the development of olfactory bulb 
neurons and Purkinje cells, respectively. 

Example 13 

Conventional immunoprecipitations using pc43-specific polyclonal 
antibodies and monoclonal antibody 38I2C were performed to identify proteins 
that interacted with pc43 in L cell transfectants. 

The pc43 and chimeric pc43 transfectants were metabolically 
labeled by incubating the cells in Dulbecco's modified Eagle's medium containing 
[35S]methionine (50 µCi/ml) overnight. After washing, the transfectants were 
lysed with PBS containing Triton X-100 and incubated with anti-pc43 antibody. 
The immunocomplexes were then collected using protein A-Sepharose beads. The 
resulting beads were washed five times with a washing buffer (50 mM Tris-HCl, 
pH 8.0, containing 0.5 M NaCl, 0.1% ovalbumin, 0.5% NP-40, 0.5% Triton X-100 
and 1 mM EDTA) at room temperature. Protein was separated by SDS-PAGE 
and subjected to autoradiography. 

The chimeric pc43 co-precipitated with a 105 kDa band and a 95 kDa 
band that are likely to correspond to α- and β-catenins, respectively, because 
anti-α-catenin and anti-β-catenin antibodies stained comparable bands. Pc43, on 
the other hand, co-precipitated with a 120 kDa band. 

While the present invention has been described in terms of specific 
methods and compositions, it is understood that variations and modifications will 
occur to those skilled in the art. Therefore, only such limitations as appear in the 
claims should be placed on the invention. 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
AARSSNNTNG AYTRYGA 17 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
TTRCTRTTRC GNGGNNN 17 
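
SEQ ID NOs: 1 and 2 are written with IUPAC ambiguity codes (R, Y, S, N), so each entry stands for a pool of oligonucleotides rather than a single sequence. The Python sketch below is illustrative only: it counts the distinct sequences in each pool and can enumerate them. The IUPAC table is the standard nomenclature; the function names clean, degeneracy and expand are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

SEQ_ID_NO_1 = "AARSSNNTNG AYTRYGA"
SEQ_ID_NO_2 = "TTRCTRTTRC GNGGNNN"


def clean(seq: str) -> str:
    """Drop the spaces used to group bases in tens and upper-case the rest."""
    return "".join(seq.split()).upper()


def degeneracy(seq: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in clean(seq):
        count *= len(IUPAC[base])
    return count


def expand(seq: str):
    """Yield every concrete sequence encoded by a degenerate oligo."""
    for combo in product(*(IUPAC[base] for base in clean(seq))):
        yield "".join(combo)


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 1", SEQ_ID_NO_1), ("SEQ ID NO: 2", SEQ_ID_NO_2)):
        print(name, len(clean(seq)), "bases,", degeneracy(seq), "sequences")
```

Counting by multiplication avoids materializing the pool; expand is only practical for oligos of modest degeneracy such as these.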
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
AAGGGAGTGG ACTTTGAGGA GCAGCCTGAG CTTAGTCTCA TCCTCACGGC TTTGGATGGA 60 
GGGACTCCAT CCAGGTCTGG GACTGCATTG GTTCAAGTGG AAGTCATAGA TGCCAATGAC 120 
AACGCACCGT A 131 
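
The cDNA fragments in this listing are each paired with a deduced amino acid sequence (SEQ ID NO: 3 with SEQ ID NO: 4, and so on). A minimal Python sketch of that relationship follows, assuming the standard genetic code and the first reading frame; the names CODON_TABLE and translate are hypothetical, and this is not presented as the method used to prepare the listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# SEQ ID NO: 3 as printed, including grouping spaces and position numbers.
SEQ_ID_NO_3 = """
AAGGGAGTGG ACTTTGAGGA GCAGCCTGAG CTTAGTCTCA TCCTCACGGC TTTGGATGGA 60
GGGACTCCAT CCAGGTCTGG GACTGCATTG GTTCAAGTGG AAGTCATAGA TGCCAATGAC 120
AACGCACCGT A 131
"""


def translate(cdna: str) -> str:
    """Translate frame 1 to one-letter amino acids.

    Only A, C, G and T are kept, so spaces, line breaks and position numbers
    are ignored (ambiguity codes would be dropped too). Trailing bases that
    do not complete a codon are discarded.
    """
    seq = "".join(ch for ch in cdna.upper() if ch in "ACGT")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate(SEQ_ID_NO_3))
```

For SEQ ID NO: 3 the 131 bases give 43 complete codons with two bases left over, which agrees with the 43-residue length stated for SEQ ID NO: 4.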
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Lys Gly Val Asp Phe Glu Glu Gln Pro Glu Leu Ser Leu Ile Leu Thr 
1 5 10 15 

Ala Leu Asp Gly Gly Thr Pro Ser Arg Ser Gly Thr Ala Leu Val Gln 
20 25 30 

Val Glu Val Ile Asp Ala Asn Asp Asn Ala Pro 
35 40 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 : 
AAACGCATGG ATTTCGAGGA GTCTTCCTCC TACCAGATCT ATGTGCAAGC TACTGACCGG 60 
GGACCAGTAC CCATGGCGGG TCATTGCAAG GTGTTGGTGC ACATTATAGA TGTGAACGAC 120 
AACGCACCTA A 131 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Lys Ala Met Asp Phe Glu Glu Ser Ser Ser Tyr Gln Ile Tyr Val Gln 
1 5 10 15 

Ala Thr Asp Arg Gly Pro Val Pro Met Ala Gly His Cys Lys Val Leu 
20 25 30 

Val Asp Ile Ile Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AAGCGACTGG ACTTTGAGAC CCTGCAGACC TTCGAGTTCA GCGTGGGTGC CACAGACCAT 60 
GGCTCCCCCT CGCTCCGCAG TCAGGCTCTG GTGCGCGTGG TGGTGCTGGA CCACAATGAC 120 
AATGCCCCCA A 131 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Lys Arg Leu Asp Phe Glu Thr Leu Gln Thr Phe Glu Phe Ser Val Gly 
1 5 10 15 

Ala Thr Asp His Gly Ser Pro Ser Leu Arg Ser Gln Ala Leu Val Arg 
20 25 30 

Val Val Val Leu Asp His Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 
AAGGGCCTGG ATTACGAGGC ACTGCAGTCC TTCGAGTTCT ACGTGGGCGC TACAGATGGA 60 
GGCTCACCCG CGCTCAGCAG CCAGACTCTG GTGCGGATGG TGGTGCTGGA TGACAACGAC 120 
AACGCCCCTA A 131 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Lys Gly Leu Asp Tyr Glu Ala Leu Gln Ser Phe Glu Phe Tyr Val Gly 
1 5 10 15 

Ala Thr Asp Gly Gly Ser Pro Ala Leu Ser Ser Gln Thr Leu Val Arg 
20 25 30 

Met Val Val Leu Asp Asp Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 
AAGGCGTTTG ATTTTGAGGA TCAGAGAGAG TTCCAGCTAA CCGCTCATAT AAACGACGGA 60 
GGTACCCCGG TTTTGGCCAC CAACATCAGC GTGAACATAT TTGTTACTGA CCGCAATGAC 120 
AACGCCCCGC A 131 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Lys Ala Phe Asp Phe Glu Asp Gln Arg Glu Phe Gln Leu Thr Ala His 
1 5 10 15 

Ile Asn Asp Gly Gly Thr Pro Val Leu Ala Thr Asn Ile Ser Val Asn 
20 25 30 

Ile Phe Val Thr Asp Arg Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
AAGGCGGTGG ATTACGAAAT CACCAAGTCC TATGAGATAG ATGTTCAAGC CCAAGATCTG 60 
GGTCCCAATT CTATTCCTGC TCATTGCAAA ATTATAATTA AGGTCGTGGA TGTCAACGAC 120 
AACGCTCCCA A 131 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Lys Ala Val Asp Tyr Glu Ile Thr Lys Ser Tyr Glu Ile Asp Val Gln 
1 5 10 15 

Ala Gln Asp Leu Gly Pro Asn Ser Ile Pro Ala His Cys Lys Ile Ile 
20 25 30 

Ile Lys Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TATGACCATG ATTACGAGAC AACCAAAGAA TATACACTGC GGATCCGGGC CCAGGATGGT 60 
GGCCGGACTC CACTTTCCAA CGTCTCCGGT CTAGTAACCG TGCAGGTCCT AGACATCAAC 120 
GACAATGCCC CCCCA 135 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Tyr Asp His Asp Tyr Glu Thr Thr Lys Glu Tyr Thr Leu Arg Ile Arg 
1 5 10 15 

Ala Gln Asp Gly Gly Arg Thr Pro Leu Ser Asn Val Ser Gly Leu Val 
20 25 30 

Thr Val Gln Val Leu Asp Ile Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 129 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 
GGGGGGTCGA TTACGAGGAG AACGGCATGT TAGAGATCGA CGTGCAGGCC AGAGACCTAG 60 
GACCTAACCC AATTCCAGCC CATTGCAAGG TCACAGTCAA GCTCATCGAC CGCAATGATA 120 
ACGCCCCCA 129 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Arg Gly Val Asp Tyr Glu Glu Asn Gly Met Leu Glu Ile Asp Val Gln 
1 5 10 15 

Ala Arg Asp Leu Gly Pro Asn Pro Ile Pro Ala His Cys Lys Val Thr 
20 25 30 

Val Lys Leu Ile Asp Arg Asn Asp Asn Ala Pro 
35 40 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AAGGGGTTGG ACTACGAAGA CACCAAACTC CATGAGATTT ACATCCAGGC CAAAGACAAA 60 
GGTGCCAATC CGGAAGGAGC GCATTGCAAA GTACTGGTAG AGGTTGTGGA CGTTAACGAC 120 
AATGCCCCTC A 131 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Lys Gly Leu Asp Tyr Glu Asp Thr Lys Leu His Glu Ile Tyr Ile Gln 
1 5 10 15 

Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys Val Leu 
20 25 30 

Val Glu Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 
AAGGGTTTGG ACTTTGAGCA AGTAGATGTC TACAAAATCC GCGTTGACGC GACGGACAAA 60 
GGACACCCTC CGATGGCAGG CCATTGCACT GTTTTAGTGA GGGTATTGGA TGAAAACGAC 120 
AATGCGCCTC T 131 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Lys Gly Leu Asp Phe Glu Gln Val Asp Val Tyr Lys Ile Arg Val Asp 
1 5 10 15 

Ala Thr Asp Lys Gly His Pro Pro Met Ala Gly His Cys Thr Val Leu 
20 25 30 

Val Arg Val Leu Asp Glu Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
AAGGGTATAG ACTTCGAGCA GATCAAGGAC TTCAGCTTTC AAGTGGAAGC CCGGGACGCC 60 
GGCAGTCCCC AGGCGCTGTC CGGCAACTGC ACTGTCAACA TCTTGATAGT GGATCAGAAC 120 
GACAACGCCC CTAA 134 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Lys Gly Ile Asp Phe Glu Gln Ile Lys Asp Phe Ser Phe Gln Val Glu 
1 5 10 15 

Ala Arg Asp Ala Gly Ser Pro Gln Ala Leu Ala Gly Asn Thr Thr Val 
20 25 30 

Asn Ile Leu Ile Val Asp Gln Asn Asp Asn Ala Pro 
35 40 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
AAGCCGTTCG ACTATGAGCA AACCGCCAAC ACGCTGGCAC AGATTGACGC CGTGCTGGAA 60 
AAACAGGGCA GCAATAAATC GAGCATTCTG GATGCCACCA TTTTCCTGGC CGATAAAAAC 120 
GACAATGCGC CAGA 134 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Lys Pro Phe Asp Tyr Glu Gln Thr Ala Asn Thr Leu Ala Gln Ile Asp 
1 5 10 15 

Ala Val Leu Glu Lys Gln Gly Ser Asn Lys Ser Ser Ile Leu Asp Ala 
20 25 30 

Thr Ile Phe Leu Ala Asp Lys Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
AAGCGGCTGG ATTTCGAACA GTTCCAGCAG CACAAGCTGC TCGTAAGGGC TGTTGATGGA 60 
GGAATGCCGC CACTGAGCAG CGATGTGGTC GTCACTGTGG ATGTCACCGA CCTCAACGAT 120 
AACGCGCCCT A 131 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Lys Arg Leu Asp Phe Glu Gln Phe Gln Gln His Lys Leu Leu Val Arg 
1 5 10 15 

Ala Val Asp Gly Gly Met Pro Pro Leu Ser Ser Asp Val Val Val Thr 
20 25 30 

Val Asp Val Thr Asp Leu Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
AAGGGGATAG ACTTTGAGAG TGAGAATTAC TATGAATTTG ATGTGCGGGC TCGCGATGGG 60 
GGTTCTCCAG CCATGGAGCA ACATTGCAGC CTTCGAGTGG ATCTGCTGGA CGTAAATGAC 120 
AACGCCCCAC T 131 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Lys Gly Ile Asp Phe Glu Ser Glu Asn Tyr Tyr Glu Phe Asp Val Arg 
1 5 10 15 

Ala Arg Asp Gly Gly Ser Pro Ala Met Glu Gln His Cys Ser Leu Arg 
20 25 30 

Val Asp Leu Leu Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
AAGGCATTGG ACTTTGAGGC CCGGCGACTG TATTCGCTGA CACTTCAGGC CACGGACCGA 60 
GGCGTGCCCT CGCTCACCGG GCGTGCCGAA GCGCTTATCC AGCTGCTAGA TGTCAACGAC 120 
AACGCACCCA T 131 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Lys Ala Leu Asp Phe Glu Ala Arg Arg Leu Tyr Ser Leu Thr Val Gln 
1 5 10 15 

Ala Thr Asp Arg Gly Val Pro Ser Leu Thr Gly Arg Ala Glu Ala Leu 
20 25 30 

Ile Gln Leu Leu Asp Val Asn Asp Asn Ala Pro 
35 40 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
AAGCCAATTG ATTACGAGGC AACTCCATAC TATAACATGG AAATTGTAGC CACAGACAGC 60 
GGAGGTCTTT CGGGAAAATG CACTGTGTCT ATACAGGTGG TGGATGTGAA CGACAACGCC 120 
CCCAA 125 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Lys Pro Ile Asp Tyr Glu Ala Thr Pro Tyr Tyr Asn Met Glu Ile Val 
1 5 10 15 

Ala Thr Asp Ser Gly Gly Leu Ser Gly Lys Cys Thr Val Ser Ile Gln 
20 25 30 

Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
AAGCGGGTAG ACTTCGAAAT GTGCAAAAGA TTTTACCTTG TGGTGGAAGC TAAAGACGGA 60 
GGCACCCCAG CCCTCAGCAC GGCAGCCACT GTCAGCATCG ACCTCACAGA TGTGAATGAT 120 
AACCCTCCTC GGTTCAGCCA AGATGTCTAC AGTGCTGTCA TCAGTGAGGA TGCCTTAGAG 180 
GGGGACTCTG TCATTCTGCT GATAGCAGAA GATGTGGATA GCAAGCCTAA TGGACAGATT 240 
CGGTTTTCCA TCGTGGGTGG AGATAGGGAC AATGAATTTG CTGTCGATCC AATCTTGGGA 300 
CTTGTGAAAG TTAAGAAGAA ACTGGACCGG GAGCGGGTGT CAGGATACTC CCTGCTCATC 360 
CAGGCAGTAG ATAGTGGCAT TCCTGCAATG TCCTCAACGA CAACTGTCAA CATTGATATT 420 
TCTGATGTGA ACGACAACGC CCCCCT 446 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Lys Arg Val Asp Phe Glu Met Cys Lys Arg Phe Tyr Leu Val Val Glu 
1 5 10 15 

Ala Lys Asp Gly Gly Thr Pro Ala Leu Ser Thr Ala Ala Thr Val Ser 
20 25 30 

Ile Asp Leu Thr Asp Val Asn Asp Asn Pro Pro Arg Phe Ser Gln Asp 
35 40 45 

Val Tyr Asp Ala Val Ile Ser Glu Asp Ala Leu Glu Gly Asp Ser Val 
50 55 60 

Ile Leu Leu Ile Ala Glu Asp Val Asp Ser Lys Pro Asn Gly Gln Ile 
65 70 75 80 

Arg Phe Ser Ile Val Gly Gly Asp Arg Asp Asn Glu Phe Ala Val Asp 
85 90 95 

Pro Ile Leu Gly Leu Val Lys Val Lys Lys Lys Leu Asp Arg Glu Arg 
100 105 110 

Val Ser Gly Tyr Ser Leu Leu Ile Gln Ala Val Asp Ser Gly Ile Pro 
115 120 125 

Ala Met Ser Ser Thr Thr Thr Val Asn Ile Asp Ile Ser Asp Val Asn 
130 135 140 

Asp Asn Ala Pro 
145 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 440 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
AAGGGGGTTG ATTATGAGAC AAACCCACGG CTACGACTGG TGCTACAGGC AGAGAGTGGA 60 
GGAGCCTTTG CTTTCTCGGT GCTGACCCTG ACCCTTCAAG ATGCCAATGA CAATGCTCCC 120 
CGTTTCCTGC AGCCTCACTA CGTGGCTTTC CTGCCAGAGT CCCGACCCTT GGAAGGGCCC 180 
CTGCTGCAGG TGGAAGCAGA CGACCTGGAT CAAGGCTCTG GAGGACAGAT CTCCTACAGT 240 
CTGGCTGCAT CCCAGCCAGC ACGGGGCTTG TTCCATGTAG ACCCAGCCAC AGGCACTATC 300 
ACTACCACAG CCATCCTGGA CCGGGAAATC TGGGCTGAAA CACGGCTGGT ACTGATGGCC 360 
ACAGACAGAG GAAGCCCAGC ATTGGTGGGC TCAGCTACCC TGACAGTGAT GGTCATCGAT 420 
ACCAACGACA ATGCTCCCCT 440 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 146 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Lys Gly Val Asp Tyr Glu Thr Asn Pro Arg Leu Arg Leu Val Leu Gln 
1 5 10 15 

Ala Glu Ser Gly Gly Ala Phe Ala Phe Ser Val Leu Thr Leu Thr Leu 
20 25 30 

Gln Asp Ala Asn Asp Asn Ala Pro Arg Phe Leu Gln Pro His Tyr Val 
35 40 45 

Ala Phe Leu Pro Glu Ser Arg Pro Leu Glu Gly Pro Leu Leu Gln Val 
50 55 60 

Glu Ala Asn Asp Leu Asp Gln Gly Ser Gly Gly Gln Ile Ser Tyr Ser 
65 70 75 80 

Leu Ala Ala Ser Gln Pro Ala Arg Gly Leu Phe His Val Asp Pro Ala 
85 90 95 

Thr Gly Thr Ile Thr Thr Thr Ala Ile Leu Asp Arg Glu Ile Trp Ala 
100 105 110 

Glu Thr Arg Leu Val Leu Met Ala Thr Asp Arg Gly Ser Pro Ala Leu 
115 120 125 

Val Gly Ser Ala Thr Leu Thr Val Met Val Ile Asp Thr Asn Asp Asn 
130 135 140 

Ala Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
AAGGTCTCGA TTATGAGGCA ACTCCATATT ATAACGTGCA AATTGTAGCC ACAGATGGTG 60 
GGGGCCTTTC AGGAAAATGC ACTGTGGCTA TAGAAGTGGT GGATGTGAAC GACGGCGCTC 120 
CAAT 124 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Lys Gly Leu Asp Tyr Glu Ala Thr Pro Tyr Tyr Asn Val Glu Ile Val 
1 5 10 15 

Ala Thr Asp Gly Gly Ala Phe Asp Glu Asn Cys Thr Val Ala Ile Glu 
20 25 30 

Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Asp Xaa Asn Glu Xaa Pro Xaa Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Asp Xaa Asp Glu Xaa Pro Xaa Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Asp Xaa Asn Asp Asn Xaa Pro Xaa Phe 
1 5 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
AAGCGGATGG ATTTTGAAGA CACCAAACTC CATGAGATTT ACATCCACGC CAAAGACAAA 60 
GGTGCCAATC CCGAAGGAGC GCATTGCAAA GTACTTGTAG AGGTTGTAGA CGTAAACGAC 120 
AACGCCCCAG T 131 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Leu Arg Met Asp Phe Glu Asp Thr Lys Leu His Glu Ile Tyr Ile Gln 
1 5 10 15 

Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys Val Leu 
20 25 30 

Val Glu Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
AAGGCTTTGG ATTACGAGGA TCAGAGAGAG TTCCAACTAA CAGCTCATAT AAACGACGGA 60 
GGTACCCCAG TCTTAGCCAC CAACATCAGC GTGAACGTAT TTGTTACTGA CCGCAATGAT 120 
AACGCCCCCT A 131 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

Lys Ala Leu Asp Tyr Glu Asp Gln Arg Glu Phe Gln Leu Thr Ala His 
1 5 10 15 

Ile Asn Asp Gly Gly Thr Pro Val Leu Ala Thr Asn Ile Ser Val Asn 
20 25 30 

Val Phe Val Thr Asp Arg Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
AAGCGCTTGG ACTACGAGGA GAGTAACAAT TATGAAATTC ACGTGGATGC TACAGATAAA 60 
GGATACCCAC CTATGGTTGC TCACTGCACC GTACTCGTGG GAATCTTGGA TGAAAATGAC 120 
AACGCACCCA T 131 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

Lys Arg Leu Asp Tyr Glu Glu Ser Asn Asn Tyr Glu Ile His Val Asp 
1 5 10 15 

Ala Thr Asp Lys Gly Tyr Pro Pro Met Val Ala His Cys Thr Val Leu 
20 25 30 

Val Gly Ile Leu Asp Glu Asn Asp Asn Ala Pro 
35 40 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
AAACCGGTGG ACTACGAGAA AGTCAAAGAC TATACCATCG AGATCCTGGC TGTGGATTCC 60 
GGCAACCCTC CACTCTCTAG CACCAACTCC CTCAAGGTGC AGGTGGTAGA CGTCAACGAT 120 
AACGCCCCTC T 131 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

Lys Pro Val Asp Tyr Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val 
1 5 10 15 

Ala Val Asp Ser Gly Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys 
20 25 30 

Val Gln Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: 
AAGCCTTTTG ATTTCGAGGA CACCAAACTC CATGAGATTT ACATCCAGGC CAAAGACAAG 60 
GGCGCCAATC CCGAAGGAGC ACATTGCAAA GTGTTGGTGG AGGTTGTGGA TGTGAACGAC 120 
AATGCCCCTC A 131 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Lys Pro Phe Asp Phe Glu Asp Thr Lys Leu His Glu Ile Tyr Ile Gln 
1 5 10 15 

Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys Val Leu 
20 25 30 

Val Glu Val Val Asp Val Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
AAAGGTGTCG ATTACGAGGT GAGTCCACGG CTGCGACTGG TGCTGCAGGC AGAGAGTCGA 60 
GGAGCCTTTG CCTTCACTGT GCTGACCCTG ACCCTGCAAG ATGCCAACGA CAACGCCCCG 120 
AG 122 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Lys Gly Val Asp Tyr Glu Val Ser Pro Arg Leu Arg Leu Val Leu Gln 
1 5 10 15 

Ala Glu Ser Arg Gly Ala Phe Ala Phe Thr Val Leu Thr Leu Thr Leu 
20 25 30 

Gln Asp Ala Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
AAAGGGATTG ATTACGAGCA GTTGAGAGAC CTACAGCTGT GGGTGACAGC CAGCGACAGC 60 
GGGGACCCGC CTCTTAGCAG CAACGTGTCA CTGAGCCTGT TTGTGCTGGA CCAGAACGAC 120 
AACGCCCCCC T 131 
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Lys Gly Ile Asp Tyr Glu Gln Leu Arg Asp Leu Gln Leu Trp Val Thr 
1 5 10 15 

Ala Ser Asp Ser Gly Asp Pro Pro Leu Ser Ser Asn Val Ser Leu Ser 
20 25 30 

Leu Phe Val Leu Asp Gln Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
AAGGCGGTCG ATTTTGAGCG CACATCCTCT TATCAACTCA TCATTCAGGC CACCAATATG 60 
GCAGGAATGG CTTCCAATGC TACAGTCAAT ATTCAGATTG TTGATGAAAA CGACAACGCC 120 
CCCCA 125 
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Lys Ala Val Asp Phe Glu Arg Thr Ser Ser Tyr Gln Leu Ile Ile Gln 
1 5 10 15 

Ala Thr Asn Met Ala Gly Met Ala Ser Asn Ala Thr Val Asn Ile Gln 
20 25 30 

Ile Val Asp Glu Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
AAACGGCTAG ACTTTGAAAA GATACAAAAA TATGTTGTAT GGATAGAGGC CAGAGATGGT 60 
GGTTTCCCTC CTTTCTCCTC TTACGAGAAA CTTGATATAA CAGTATTAGA TGTCAACGAT 120 
AACGCGCCTA A 131 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Lys Arg Leu Asp Phe Glu Lys Ile Gln Lys Tyr Val Val Trp Ile Glu
1 5 10 15

Ala Arg Asp Gly Gly Phe Pro Pro Phe Ser Ser Tyr Glu Lys Leu Asp
20 25 30

Ile Thr Val Leu Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
AAGGGGATCG ATTATGAGAA GGTCAAAGAC TACACCATTG AGATTGTGGC TGTGGACTCT 60 
GGCAACCCCC CACTCTCCAG CACTAACTCC CTCAAGGTGC AGGTGGTGGA CGTCAATGAC 120 

AACGCACCGT G 131

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Lys Gly Ile Asp Tyr Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val
1 5 10 15

Ala Val Asp Ser Gly Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys
20 25 30

Val Gln Val Val Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: 
AAGGGACTCG ACTACGAGGA TCGGCGGGAA TTTGAATTAA CAGCTCATAT CAGCGATGGG 60 
GGCACCCCGG TCCTAGCCAC CAACATCAGC GTGAACATAT TTGTCACTGA TCGCAACGAT 120 
AATGCCCCCG T 131 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Lys Gly Leu Asp Tyr Glu Asp Arg Arg Glu Phe Glu Leu Thr Ala His 
1 5 10 15 

Ile Ser Asp Gly Gly Thr Pro Val Leu Ala Thr Asn Ile Ser Val Asn
20 25 30

Ile Phe Val Thr Asp Arg Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
AAGGGTTTGG ACTACGAGAC CACACAGGCC TACCAGCTCA CGGTCAACGC CACAGATCAA 60
GACAACACCA GGCCTCTGTC CACCCTGGCC AACTTGGCCA TCATCATCAC AGATGTCCAG 120
GACATGGACC CCATCTTCAT CAACCTGCCT TACAGCACCA ACATCTACGA GCATTCTCCT 180

CCGGGCACGA CGGTGCGCAT CATCACCGCC ATAGACCAGG ATCAAGGACG TCCCCGGGGC 240 

ATTGGCTACA CCATCGTTTC AGGGAATACC AACAGCATCT TTGCCCTGGA CTACATCAGC 300 

GGAGTGCTGA CCTTGAATGG CCTGCTGGAC CGGGAGAACC CCCTGTACAG CCATGGCTTC 360 

ATCCTGACTG TGAAGGGCAC GGAGCTGAAC GATGACCGCA CCCCATCTGA CGCTACAGTC 420 

ACCACGACCT TCAATATCCT GGTTATTGAC ATCAACGACA ACGCCCCACT 470 

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 156 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Lys Gly Leu Asp Tyr Glu Thr Thr Gln Ala Tyr Gln Leu Thr Val Asn
1 5 10 15

Ala Thr Asp Gln Asp Asn Thr Arg Pro Leu Ser Thr Leu Ala Asn Leu
20 25 30

Ala Ile Ile Ile Thr Asp Val Gln Asp Met Asp Pro Ile Phe Ile Asn
35 40 45

Leu Pro Tyr Ser Thr Asn Ile Tyr Glu His Ser Pro Pro Gly Thr Thr
50 55 60

Val Arg Ile Ile Thr Ala Ile Asp Gln Asp Gln Gly Arg Pro Arg Gly
65 70 75 80

Ile Gly Tyr Thr Ile Val Ser Gly Asn Thr Asn Ser Ile Phe Ala Leu
85 90 95

Asp Tyr Ile Ser Gly Val Leu Thr Leu Asn Gly Leu Leu Asp Arg Glu
100 105 110

Asn Pro Leu Tyr Ser Gly Gly Phe Ile Leu Thr Val Lys Gly Thr Glu
115 120 125

Leu Asn Asp Asp Arg Thr Pro Ser Asp Ala Thr Val Thr Thr Thr Phe
130 135 140

Asn Ile Leu Val Ile Asp Ile Asn Asp Asn Ala Pro
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
AAGGGGGTCG ATTACGAGGT ACTACAGGCC TTTGAGTTCC ACGTGAGCGC CACAGACCGA 60 
GGCTCACCGG GGCTCAGCAG CCAGGCTCTG GTGCCCCTCG TGGTGCTGGA CGACAATGAC 120 
AACGCTCCCG T 131 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Lys Gly Val Asp Tyr Glu Val Leu Gln Ala Phe Glu Phe His Val Ser
1 5 10 15

Ala Thr Asp Arg Gly Ser Pro Gly Leu Ser Ser Gln Ala Leu Val Arg
20 25 30 

Val Val Val Leu Asp Asp Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: 
AAGGGGCTGG ATTATGAGCA GTTCCAGACC CTACAACTGG GAGTGACCGC TAGTGACAGT 60
GGAAACCCAC CATTAAGAAG CAATATTTCA CTGACCCTTT TCGTGCTGGA CCAGAATGAT 120
AACGCCCCAA A 131
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: 

Lys Gly Leu Asp Tyr Glu Gln Phe Gln Thr Leu Gln Leu Gly Val Thr
1 5 10 15

Ala Ser Asp Ser Gly Asn Pro Pro Leu Arg Ser Asn Ile Ser Leu Thr
20 25 30

Leu Phe Val Leu Asp Gln Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
AAGCGGGTTG ATTACGAGGA TGTCCAGAAA TACTCGCTGA GCATTAAGGC CCAGGATGGG 60 
CGGCCCCCGC TCATCAATTC TTCAGGGGTC GTGTCTGTGC AGGTGCTGGA TGTCAACGAC 120 
AATGCCCCGG A 131 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Lys Arg Val Asp Tyr Glu Asp Val Gln Lys Tyr Ser Leu Ser Ile Lys
1 5 10 15

Ala Gln Asp Gly Arg Pro Pro Leu Ile Asn Ser Ser Gly Val Val Ser
20 25 30

Val Gln Val Leu Asp Val Asn Asp Asn Ala Pro
35 40



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
AAACCGGTAG ACTTTGAGCT ACAGCAGTTC TATGAAGTAG CTGTGGTGGC TTGGAACTCT 60 
GAGGGATTTC ATGTCAAAAG GGTCATTAAA GTGCAACTTT TAGATGACAA CGACAATGCC 120 
CCGAT 125
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Lys Pro Val Asp Phe Glu Leu Gln Gln Phe Tyr Glu Val Ala Val Val
1 5 10 15

Ala Trp Asn Ser Glu Gly Phe His Val Lys Arg Val Ile Lys Val Gln
20 25 30

Leu Leu Asp Asp Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
AAGGGATTAG ATTTTGAAAC TTTGCCCATT TACACATTGA TAATACAAGG AACTAACATG 60 
GCTGGTTTGT CCACTAATAC AACGGTTCTA GTTCACTTGC AGGATGAGAA TGATAACGCC 120 
CCAAA 125 
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Lys Gly Leu Asp Phe Glu Thr Leu Pro Ile Tyr Thr Leu Ile Ile Gln
1 5 10 15

Gly Thr Asn Met Ala Gly Leu Ser Thr Asn Thr Thr Val Leu Val His
20 25 30

Leu Gln Asp Glu Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 134 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
AAGCGGGCGG ATTTCGAGGC GATCCGGGAG TACAGTCTGA GGATCAAAGC GCAGGACGGG 60 
GGGCGGCCTC CCCTCAGCAA CACCACGGGC ATGGTCACAG TGCAGGTCGT GGACGTCAAT 120 
GACAACGCAC CCCT 134 
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Lys Arg Ala Asp Phe Glu Ala Ile Arg Glu Tyr Ser Leu Arg Ile Lys
1 5 10 15

Ala Gln Asp Gly Gly Arg Pro Pro Leu Ser Asn Thr Thr Gly Met Val
20 25 30

Thr Val Gln Val Val Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80: 
AAGCGGTTGG ATTACGAAAA GGCATCGGAA TATGAAATCT ATGTTCAAGC CGCTGACAAA 60 
GGCGCTGTCC CTATGGCTGG CCATTGCAAA GTGTTGCTGG AGATCGTGGA TGTCAACGAC 120 
AACGCCCCCT T 131
(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: 

Lys Arg Leu Asp Tyr Glu Lys Ala Ser Glu Tyr Glu Ile Tyr Val Gln
1 5 10 15

Ala Ala Asp Lys Gly Ala Val Pro Met Ala Gly His Cys Lys Val Leu
20 25 30

Leu Glu Ile Val Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
AAGGGGATCG ATTATGAGGA TCAGGTCTCT TACACATTAG CAGTAACAGC ACATGACTAT 60 
GGCATCCCTC AAAAATCAGA CACTACCTAT TTGGAAATCT TAGTAATTGA TGTTAACGAC 120
AACGCGCCCC A 131 
(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Lys Gly Ile Asp Tyr Glu Asp Gln Val Ser Tyr Thr Leu Ala Val Thr
1 5 10 15

Ala His Asp Tyr Gly Ile Pro Gln Lys Ser Asp Thr Thr Tyr Leu Glu
20 25 30

Ile Leu Val Ile Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: 
AAAGGGTTAG ATTTCGAGGG CACTAAAGAT TCAGCGTTTA AAATAGTGGC AGCTGACACA 60
GGGAAGCCCA GCCTCAACCA GACAGCCCTG GTGAGAGTAG AGCTGGAGGA TGAGAACGAC 120
AACGCCCCAA T 131
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85: 

Lys Gly Leu Asp Phe Glu Gly Thr Lys Asp Ser Ala Phe Lys Ile Val
1 5 10 15

Ala Ala Asp Thr Gly Lys Pro Ser Leu Asn Gln Thr Ala Leu Val Arg
20 25 30 

Val Glu Leu Glu Asp Glu Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: 

AAGGGTGTGG ATTTTGAAAG TGTGCGTAGC TACAGGCTGG TTATTCGTGC TCAAGATGGA 60 

GGCAGCCCCT CCAGAAGTAA CACCACCCAG CTCTTGGTCA ACGTCATCGA TCGAATGACA 120 

ATGCGCCGCT 130
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Lys Gly Val Asp Phe Glu Ser Val Arg Ser Tyr Arg Leu Val Ile Arg
1 5 10 15

Ala Gln Asp Gly Gly Ser Pro Ser Arg Ser Asn Thr Thr Gln Leu Leu
20 25 30

Val Asn Val Ile Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
AAGGGTGTGG ACTTCGAGCT GACACATCTG TATGAGATTT GGATTGAGGC TGCCGATGGA 60 
GACACGCCAA GTCTGCGTAG TGTAACTCTT ATAACGCTCA ACGTAACGGA TGCCAATGAC 120 
AATGCTCCCA A 131 
(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Lys Gly Val Asp Phe Glu Leu Thr His Leu Tyr Glu Ile Trp Ile Glu
1 5 10 15

Ala Ala Asp Gly Asp Thr Pro Ser Leu Arg Ser Val Thr Leu Ile Thr
20 25 30 

Leu Asn Val Thr Asp Ala Asn Asp Asn Ala Pro 
35 40 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

CAAGGCGTTT GATTTTGAAG AGACAAGTAG ATATGTCTTG AGTGTGGAAG CTAAGGATGG 60 

AGGAGTACAC ACAGCTCACT GTAATGTTCA AATAGAAATT GTTGACGAGA ATGACAATGC 120 

CCCAGAGGTG ACATTCATGT CCTTCTCTAA CCAGATTCCA GAGGATTCAG ACCTTGGAAC 180 

TGTAATAGCC CTCATAAAAG TGCGAGACAA GGATTCTGGG CAAAATGGCA TGGTGACATG 240 

CTATACTCAG GAAGAAGTTC CTTTCAAATT AGAATCCACC TCGAAGAATT ATTACAAGCT 300 

GGTGATTGCT GGAGCCCTAA ACCGGGAGCA GACAGCAGAC TACAACGTCA CAATCATAGC 360 

CACCGACAAG GGCAAACCAG CCCTTTCCTC CAGGACAAGC ATCACCCTGC ACATCTCCGA 420 

CATCAACGAT AATGCCCCCG T 441 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 146 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Lys Ala Phe Asp Phe Glu Glu Thr Ser Arg Tyr Val Leu Ser Val Glu
1 5 10 15

Ala Lys Asp Gly Gly Val His Thr Ala His Cys Asn Val Gln Ile Glu
20 25 30

Ile Val Asp Glu Asn Asp Asn Ala Pro Glu Val Thr Phe Met Ser Phe
35 40 45

Ser Asn Gln Ile Pro Glu Asp Ser Asp Leu Gly Thr Val Ile Ala Leu
50 55 60

Ile Lys Val Arg Asp Lys Asp Ser Gly Gln Asn Gly Met Val Thr Cys
65 70 75 80

Tyr Thr Gln Glu Glu Val Pro Phe Lys Leu Glu Ser Thr Ser Lys Asn
85 90 95

Tyr Tyr Lys Leu Val Ile Ala Gly Ala Leu Asn Arg Glu Gln Thr Ala
100 105 110

Asp Tyr Asn Val Thr Ile Ile Ala Thr Asp Lys Gly Lys Pro Ala Leu
115 120 125

Ser Ser Arg Thr Ser Ile Thr Leu His Ile Ser Asp Ile Asn Asp Asn
130 135 140 

Ala Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
AAGCGAGTGG ATTACGAGGC CACTCGGAAT TATAAGCTGA GAGTTAAGGC TACTGATCTT 60
GGGATTCCAC CGAGATCTTC TAACATGACA CTGTTCATTC ATGTCCTTGA TGTTAACGAC 120
AACGCTCCCT T 131

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Lys Arg Val Asp Tyr Glu Ala Thr Arg Asn Tyr Lys Leu Arg Val Lys
1 5 10 15

Ala Thr Asp Leu Gly Ile Pro Pro Arg Ser Ser Asn Met Thr Leu Phe
20 25 30

Ile His Val Leu Asp Val Asn Asp Asn Ala Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4104 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 495..3572



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94: 






CCTCTATTCG ACATTCTCTT TGGATTGTTT TGCTATAACT TGAAATTTGG GATGTCACAA 60 

ACGAAACTGT CATCTGTTTC CGCCAAACTG TGGTTCTGCT AATCTCCCAG GCTGGCAGCA 120 

TTGGAGACTT GCTGACTTCT TTCATCCCCC ACTCTTTTCA CCTGAAATTC CTTTCCTTGG 180 

TTTTGCTCTA AGTCCTATGC TTCAGTCAGG GGCCAACCAA ATCTCACTGC CTCCTTTTTA 240 

TCATGAAGCC TTTGATCACT GATAGTTCTT TTTATATCTT GAAAAATCAC CCTTCCCAGT 300 

ACAGTTAATA TTTAGTATCT CTACTCATCT TGGCACTTAC TCACAGCTCC ATAATTCACT 360 

CGTTTTCGTA CCTCTTCATG GTGATGGGGA GCCCTTTGGA GGTGGTGACT GTGCTTTATA 420

CTCCTCATGA TGCTTCACAT GTGGCAGGCG TGGAGTGCCC GGAGGCGGCC CTCCTGATTC 480 

TGGGGCCTCC CAGG ATG GAG CCC CTG AGG CAC AGC CCA GGC CCT GGG GGG 530 
Met Glu Pro Leu Arg His Ser Pro Gly Pro Gly Gly
1 5 10 

CAA CGG CTA CTG CTG CCC TCC ATG CTG CTA GCA CTG CTG CTC CTG CTG 578 
Gln Arg Leu Leu Leu Pro Ser Met Leu Leu Ala Leu Leu Leu Leu Leu
15 20 25 

GCT CCA TCC CCA GGC CAC GCC ACT CGG GTA GTG TAC AAG GTG CCG GAG 626 
Ala Pro Ser Pro Gly His Ala Thr Arg Val Val Tyr Lys Val Pro Glu 
30 35 40 

GAA CAG CCA CCC AAC ACC CTC ATT GGG AGC CTC GCA GCC GAC TAT GGT 674 
Glu Gln Pro Pro Asn Thr Leu Ile Gly Ser Leu Ala Ala Asp Tyr Gly
45 50 55 60 

TTT CCA GAT GTG GGG CAC CTG TAC AAG CTA GAG GTG GGT GCC CCG TAC 722 
Phe Pro Asp Val Gly His Leu Tyr Lys Leu Glu Val Gly Ala Pro Tyr 
65 70 75 

CTT CGC GTG GAT GGC AAG ACA GGT GAC ATT TTC ACC ACC GAG ACC TCC 770 
Leu Arg Val Asp Gly Lys Thr Gly Asp Ile Phe Thr Thr Glu Thr Ser
80 85 90 

ATC GAC CGT GAG GGG CTC CGT GAA TGC CAG AAC CAG CTC CCT GGT GAT 818 
Ile Asp Arg Glu Gly Leu Arg Glu Cys Gln Asn Gln Leu Pro Gly Asp
95 100 105 

CCC TGC ATC CTG GAG TTT GAG GTA TCT ATC ACA GAC CTC GTG CAC AAT 866 
Pro Cys Ile Leu Glu Phe Glu Val Ser Ile Thr Asp Leu Val Gln Asn
110 115 120

GCG AGC CCC CGG CTG CTA GAG GGC CAG ATA GAA GTA CAA GAC ATC AAT 914 
Ala Ser Pro Arg Leu Leu Glu Gly Gln Ile Glu Val Gln Asp Ile Asn
125 130 135 140 

GAC AAC ACA CCC AAC TTC GCC TCA CCA GTC ATC ACT CTG GCC ATC CCT 962 
Asp Asn Thr Pro Asn Phe Ala Ser Pro Val Ile Thr Leu Ala Ile Pro
145 150 155

GAG AAC ACC AAC ATC GGC TCA CTC TTC CCC ATC CCG CTG GCT TCA GAC 1010 
Glu Asn Thr Asn Ile Gly Ser Leu Phe Pro Ile Pro Leu Ala Ser Asp
160 165 170 

CGT GAT GCT GGT CCC AAC GGT GTG GCA TCC TAT GAG CTG CAG GTG GCA 1058
Arg Asp Ala Gly Pro Asn Gly Val Ala Ser Tyr Glu Leu Gln Val Ala
175 180 185 

GAG GAC CAG GAG GAG AAG CAA CCA CAG CTC ATT GTG ATG GGC AAC CTG 1106 
Glu Asp Gln Glu Glu Lys Gln Pro Gln Leu Ile Val Met Gly Asn Leu
190 195 200 

GAC CGT GAG CGC TGG GAC TCC TAT GAC CTC ACC ATC AAG GTG CAG GAT 1154 
Asp Arg Glu Arg Trp Asp Ser Tyr Asp Leu Thr Ile Lys Val Gln Asp
205 210 215 220 

GGC GGC AGC CCC CCA CGC GCC ACG AGT GCC CTG CTG CGT GTC ACC GTG 1202 
Gly Gly Ser Pro Pro Arg Ala Thr Ser Ala Leu Leu Arg Val Thr Val 
225 230 235 

CTT GAC ACC AAT GAC AAC GCC CCC AAG TTT GAG CGG CCC TCC TAT GAG 1250 
Leu Asp Thr Asn Asp Asn Ala Pro Lys Phe Glu Arg Pro Ser Tyr Glu 
240 245 250 

GCC GAA CTA TCT GAG AAT AGC CCC ATA GGC CAC TCG GTC ATC CAG GTG 1298 
Ala Glu Leu Ser Glu Asn Ser Pro Ile Gly His Ser Val Ile Gln Val
255 260 265 

AAG GCC AAT GAC TCA GAC CAA GGT GCC AAT GCA GAA ATC GAA TAC ACA 1346 
Lys Ala Asn Asp Ser Asp Gln Gly Ala Asn Ala Glu Ile Glu Tyr Thr
270 275 280 

TTC CAC CAG GCG CCC GAA GTT GTG AGG CGT CTT CTT CGA CTG GAC AGG 1394 
Phe His Gln Ala Pro Glu Val Val Arg Arg Leu Leu Arg Leu Asp Arg
285 290 295 300 

AAC ACT GGA CTT ATC ACT GTT CAG GGC CCG GTG CAC CGT GAG GAC CTA 1442 
Asn Thr Gly Leu Ile Thr Val Gln Gly Pro Val Asp Arg Glu Asp Leu
305 310 315 

AGC ACC CTG CGC TTC TCA GTG CTT GCT AAG GAC CGA GGC ACC AAC CCC 1490 
Ser Thr Leu Arg Phe Ser Val Leu Ala Lys Asp Arg Gly Thr Asn Pro 
320 325 330 

AAG AGT GCC CGT GCC CAG GTG GTT GTG ACC GTG AAG GAC ATG AAT GAC 1538 
Lys Ser Ala Arg Ala Gln Val Val Val Thr Val Lys Asp Met Asn Asp
335 340 345 

AAT GCC CCC ACC ATT GAG ATC CGG GGC ATA GGG CTA GTG ACT CAT CAA 1586 
Asn Ala Pro Thr Ile Glu Ile Arg Gly Ile Gly Leu Val Thr His Gln
350 355 360 

GAT GGG ATG GCT AAC ATC TCA GAG GAT GTG GCA GAG GAG ACA GCT GTG 1634 
Asp Gly Met Ala Asn Ile Ser Glu Asp Val Ala Glu Glu Thr Ala Val
365 370 375 380 

GCC CTG GTG CAG GTG TCT GAC CGA GAT GAG GGA GAG AAT GCA GCT CTC 1682 
Ala Leu Val Gln Val Ser Asp Arg Asp Glu Gly Glu Asn Ala Ala Val
385 390 395 

ACC TGT GTG GTG GCA GGT GAT GTG CCC TTC CAG CTG CGC CAG GCC AGT 1730 
Thr Cys Val Val Ala Gly Asp Val Pro Phe Gln Leu Arg Gln Ala Ser
400 405 410 

GAG ACA GGC AGT GAC AGC AAG AAG AAG TAT TTC CTG CAG ACT ACC ACC 1778
Glu Thr Gly Ser Asp Ser Lys Lys Lys Tyr Phe Leu Gln Thr Thr Thr
415 420 425 

CCG CTA GAC TAC GAG AAG GTC AAA GAC TAC ACC ATT GAG ATT GTG GCT 1826 
Pro Leu Asp Tyr Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val Ala
430 435 440 

GTG GAC TCT GGC AAC CCC CCA CTC TCC AGC ACT AAC TCC CTC AAG GTG 1874 
Val Asp Ser Gly Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys Val
445 450 455 460 

CAG GTG GTG GAC GTC AAT GAC AAC GCA CCT GTC TTC ACT CAG AGT GTC 1922 
Gln Val Val Asp Val Asn Asp Asn Ala Pro Val Phe Thr Gln Ser Val
465 470 475 

ACT GAG GTC GCC TTC CCG GAA AAC AAC AAG CCT GGT GAA GTC ATT GCT 1970 
Thr Glu Val Ala Phe Pro Glu Asn Asn Lys Pro Gly Glu Val Ile Ala
480 485 490 

GAG ATC ACT GCC AGT GAT GCT GAC TCT GGC TCT AAT GCT GAG CTG GTT 2018 
Glu Ile Thr Ala Ser Asp Ala Asp Ser Gly Ser Asn Ala Glu Leu Val
495 500 505 

TAC TCT CTG GAG CCT GAG CCG GCT GCT AAG GGC CTC TTC ACC ATC TCA 2066 
Tyr Ser Leu Glu Pro Glu Pro Ala Ala Lys Gly Leu Phe Thr Ile Ser
510 515 520 

CCC GAG ACT GGA GAG ATC CAG GTG AAC ACA TCT CTG GAT CGG GAA CAG 2114 
Pro Glu Thr Gly Glu Ile Gln Val Lys Thr Ser Leu Asp Arg Glu Gln
525 530 535 540 

CGG GAG AGC TAT GAG TTG AAG GTG GTG GCA GCT GAC CGG GGC AGT CCT 2162 
Arg Glu Ser Tyr Glu Leu Lys Val Val Ala Ala Asp Arg Gly Ser Pro 
545 550 555 

AGC CTC CAG GGC ACA GCC ACT GTC CTT GTC AAT GTG CTG GAC TGC AAT 2210 
Ser Leu Gln Gly Thr Ala Thr Val Leu Val Asn Val Leu Asp Cys Asn
560 565 570

GAC AAT GAC CCC AAA TTT ATG CTG AGT GGC TAC AAC TTC TCA GTG ATG 2258
Asp Asn Asp Pro Lys Phe Met Leu Ser Gly Tyr Asn Phe Ser Val Met
575 580 585 

GAG AAC ATG CCA GCA CTG AGT CCA GTG GGC ATG GTG ACT GTC ATT GAT 2306 
Glu Asn Met Pro Ala Leu Ser Pro Val Gly Met Val Thr Val Ile Asp
590 595 600 

GGA GAC AAG GGG GAG AAT GCC CAG GTG CAG CTC TCA GTG GAG CAG GAC 2354 
Gly Asp Lys Gly Glu Asn Ala Gln Val Gln Leu Ser Val Glu Gln Asp
605 610 615 620 

AAC GGT GAC TTT GTT ATC CAG AAT GGC ACA GGC ACC ATC CTA TCC AGC 2402 
Asn Gly Asp Phe Val Ile Gln Asn Gly Thr Gly Thr Ile Leu Ser Ser
625 630 635 

CTG AGC TTT GAT CGA GAG CAA CAA AGC ACC TAC ACC TTC CAG CTG AAG 2450 
Leu Ser Phe Asp Arg Glu Gln Gln Ser Thr Tyr Thr Phe Gln Leu Lys
640 645 650 

GCA GTG GAT GGT GGC GTC CCA CCT CGC TCA CCT TAC GTT GGT CTC ACC 2498
Ala Val Asp Gly Gly Val Pro Pro Arg Ser Ala Tyr Val Gly Val Thr
655 660 665

ATC AAT GTG CTG GAC GAG AAT GAC AAC GCA CCC TAT ATC ACT GCC CCT 2546
Ile Asn Val Leu Asp Glu Asn Asp Asn Ala Pro Tyr Ile Thr Ala Pro
670 675 680

TCT AAC ACC TCT CAC AAG CTG CTG ACC CCC CAG ACA CGT CTT GGT GAG 2594
Ser Asn Thr Ser His Lys Leu Leu Thr Pro Gln Thr Arg Leu Gly Glu
685 690 695 700

ACG GTC ACC CAG CTC CCA GCC GAG GAC TTT GAC TCT GGT GTC AAT GCC 2642
Thr Val Ser Gln Val Ala Ala Glu Asp Phe Asp Ser Gly Val Asn Ala
705 710 715

GAG CTG ATC TAC AGC ATT GCA GGT GGC AAC CCT TAT GGA CTC TTC CAC 2690
Glu Leu Ile Tyr Ser Ile Ala Gly Gly Asn Pro Tyr Gly Leu Phe Gln
720 725 730

ATT GGG TCA CAT TCA GGT GCC ATC ACC CTG GAG AAG GAG ATT GAG CCC 2738
Ile Gly Ser His Ser Gly Ala Ile Thr Leu Glu Lys Glu Ile Glu Arg
735 740 745

CGC CAC CAT GGG CTA CAC CGC CTG GTG GTG AAG GTC ACT r*n rrr ^„ 2786
Arg His His Gly Leu His Arg Leu Val Val Lys Val Ser Asp Arg Gly
750 755 760

AAG CCC CCA CGC TAT GGC ACA GCC TTG GTC CAT CTT TAT ctp aat ^ 2834
Lys Pro Pro Arg Tyr Gly Thr Ala Leu Val His Leu Tyr Val Asn Glu
765 770 775 780

?S S SS SS SI Jg 2S S s: JS S S? His ts s 2882
Thr Leu Ala Asn Arg Thr Leu Leu Glu Thr Leu Leu Gly His Ser Leu
785 790 795

2S JS S S £ J2 2? JS S2 S? £ s 2: £ SX f£ 2930
Asp Thr Pro Leu Asp Ile Asp Ile Ala Gly Asp Pro Glu Tyr Glu Arg
800 805 810

Ser JJS SK SS £5 it** ?F CTC 7X7 GGT CTG «C CCT GTG GTG 2978
Ser Lys Gln Arg Gly Asn Ile Leu Phe Gly Val Val Ala Gly Val Val
815 820 825

s 5 s = s s s 2 s en s c 2- g- - - 3026
Ala Val Ala Leu Leu Ile Ala Leu Ala Val Leu Val Arg Tyr Cys Arg
830 835 840

SS 2S SS £, c J£ J2 £J ™ «; « £ T r f* 0 °? c ACC — 3074
Gln Arg Glu Ala Lys Ser Gly Tyr Gln Ala Gly Lys Lys Glu Thr Lys
845 850 855 860

Asp £u lyl 2a £0 £s £0 f * S° ^ ° CC TCC ™ <** ** C ™ 3122
Asp Leu Tyr Ala Pro Lys Pro Ser Gly Lys Ala Ser Lys Gly Asn Lys
865 870 875

Ser Lys £s £s i£ 1°° ?* CCC GTG ** G C <* «~ CAG 3170
Ser Lys Gly Lys Lys Ser Lys Ser Pro Lys Pro Val Lys Pro Val Glu
880 885 890

GAC GAG CAT GAG GCC GGG CTG CAG AAG TCC CTC AAG TTC AAC CTC ATG 3218
Asp Glu Asp Glu Ala Gly Leu Gln Lys Ser Leu Lys Phe Asn Leu Met
895 900 905 

AGC GAT GCC CCT GGG GAC ACT CCC CGC ATC CAC CTG CCC CTC AAC TAC 3266 
Ser Asp Ala Pro Gly Asp Ser Pro Arg Ile His Leu Pro Leu Asn Tyr
910 915 920 

CCA CCA GCC AGC CCT GAC CTG GGC CGC CAC TAT CGC TCT AAC TCC CCA 3314 
Pro Pro Gly Ser Pro Asp Leu Gly Arg His Tyr Arg Ser Asn Ser Pro 
925 930 935 940 

CTG CCT TCC ATC CAG CTG CAG CCC CAG TCA CCC TCA GCC TCC AAG AAG 3362 
Leu Pro Ser Ile Gln Leu Gln Pro Gln Ser Pro Ser Ala Ser Lys Lys
945 950 955 

CAC CAG GTG GTA CAG GAC CTG CCA CCT GCA AAC ACA TTC GTG GGC ACC 3410 
His Gln Val Val Gln Asp Leu Pro Pro Ala Asn Thr Phe Val Gly Thr
960 965 970

GGG GAC ACC ACG TCC ACG GGC TCT GAG CAG TAC TCC GAC TAC AGC TAC 3458 
Gly Asp Thr Thr Ser Thr Gly Ser Glu Gln Tyr Ser Asp Tyr Ser Tyr
975 980 985 

CGC ACC AAC CCC CCC AAA TAC CCC AGC AAG CAG GTA GGC CAG CCC TTT 3506 

Arg Thr Asn Pro Pro Lys Tyr Pro Ser Lys Gln Val Gly Gln Pro Phe
990 995 1000 

CAG CTC AGC ACA CCC CAG CCC CTA CCC CAC CCC TAC CAC GGA GCC ATC 3554 
Gln Leu Ser Thr Pro Gln Pro Leu Pro His Pro Tyr His Gly Ala Ile
1005 1010 1015 1020 

TGG ACC GAG GTG TGG GAG TGATGGAGCA GGTTTACTGT GCCTGCCCGT 3602
Trp Thr Glu Val Trp Glu
1025 

GTTGGCGGCC AGCCTGAGCC AGCAGTCGGA GGTGGGGCCT TAGTGCCTCA CCGGGCACAC 3662 

GGATTAGGCT GAGTGAAGAT TAACGGAGCG TGTCCTCTGT GGTCTCCTCC CTGCCCTCTC 3722 

CCCACTGGGG AGAGACCTGT CATTTGCCAA GTCCCTCGAC CCTGGACCAG CTACTGGGCC 3782 

TTATGGGTTG GGGGTGGTAG GCAGGTGAGC GTAAGTGGGG ACGGAAATGG GTAAGAACTC 3842 

TACTCCAAAC CTAGGTCTCT ATGTCAGACC AGACCTAGGT GCTTCTCTAG GAGGGAAACA 3902 

GGGAGACCTG GGCTCCTCTG CATAACTGAG TGGGGAGTCT GCCAGGGGAG GGCACCTTCC 3962

CATTGTGCCT TCTGTGTGTA TTGTGCATTA ACCTCTTCCT CACCACTAGG CTTCTGGCGC 4022 

TGGGTCCCAC ATGCCCTTGA CCCTGACAAT AAACTTCTCT ATTTTTGGAA AAAAAAAAAA 4082 

AAAAAAAAAA AAAAAAAAAA AA 4104 

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1026 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Met Glu Pro Leu Arg His Ser Pro Gly Pro Gly Gly Gln Arg Leu Leu
1 5 10 15

Leu Pro Ser Met Leu Leu Ala Leu Leu Leu Leu Leu Ala Pro Ser Pro
20 25 30

Gly His Ala Thr Arg Val Val Tyr Lys Val Pro Glu Glu Gln Pro Pro
35 40 45

Asn Thr Leu Ile Gly Ser Leu Ala Ala Asp Tyr Gly Phe Pro Asp Val
50 55 60

Gly His Leu Tyr Lys Leu Glu Val Gly Ala Pro Tyr Leu Arg Val Asp
65 70 75 80

Gly Lys Thr Gly Asp Ile Phe Thr Thr Glu Thr Ser Ile Asp Arg Glu
85 90 95

Gly Leu Arg Glu Cys Gln Asn Gln Leu Pro Gly Asp Pro Cys Ile Leu
100 105 110

Glu Phe Glu Val Ser Ile Thr Asp Leu Val Gln Asn Ala Ser Pro Arg
115 120 125

Leu Leu Glu Gly Gln Ile Glu Val Gln Asp Ile Asn Asp Asn Thr Pro
130 135 140

Asn Phe Ala Ser Pro Val Ile Thr Leu Ala Ile Pro Glu Asn Thr Asn
145 150 155 160

Ile Gly Ser Leu Phe Pro Ile Pro Leu Ala Ser Asp Arg Asp Ala Gly
165 170 175

Pro Asn Gly Val Ala Ser Tyr Glu Leu Gln Val Ala Glu Asp Gln Glu
180 185 190

Glu Lys Gln Pro Gln Leu Ile Val Met Gly Asn Leu Asp Arg Glu Arg
195 200 205

Trp Asp Ser Tyr Asp Leu Thr Ile Lys Val Gln Asp Gly Gly Ser Pro
210 215 220

Pro Arg Ala Thr Ser Ala Leu Leu Arg Val Thr Val Leu Asp Thr Asn
225 230 235 240

Asp Asn Ala Pro Lys Phe Glu Arg Pro Ser Tyr Glu Ala Glu Leu Ser
245 250 255

Glu Asn Ser Pro Ile Gly His Ser Val Ile Gln Val Lys Ala Asn Asp
260 265 270 

Ser Asp Gln Gly Ala Asn Ala Glu Ile Glu Tyr Thr Phe His Gln Ala
275 280 285

Pro Glu Val Val Arg Arg Leu Leu Arg Leu Asp Arg Asn Thr Gly Leu
290 295 300

Ile Thr Val Gln Gly Pro Val Asp Arg Glu Asp Leu Ser Thr Leu Arg
305 310 315 320

Phe Ser Val Leu Ala Lys Asp Arg Gly Thr Asn Pro Lys Ser Ala Arg
325 330 335

Ala Gln Val Val Val Thr Val Lys Asp Met Asn Asp Asn Ala Pro Thr
340 345 350

Ile Glu Ile Arg Gly Ile Gly Leu Val Thr His Gln Asp Gly Met Ala
355 360 365

Asn Ile Ser Glu Asp Val Ala Glu Glu Thr Ala Val Ala Leu Val Gln
370 375 380

Val Ser Asp Arg Asp Glu Gly Glu Asn Ala Ala Val Thr Cys Val Val
385 390 395 400

Ala Gly Asp Val Pro Phe Gln Leu Arg Gln Ala Ser Glu Thr Gly Ser
405 410 415

Asp Ser Lys Lys Lys Tyr Phe Leu Gln Thr Thr Thr Pro Leu Asp Tyr
420 425 430

Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val Ala Val Asp Ser Gly
435 440 445

Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys Val Gln Val Val Asp
450 455 460

Val Asn Asp Asn Ala Pro Val Phe Thr Gln Ser Val Thr Glu Val Ala
465 470 475 480

Phe Pro Glu Asn Asn Lys Pro Gly Glu Val Ile Ala Glu Ile Thr Ala
485 490 495

Ser Asp Ala Asp Ser Gly Ser Asn Ala Glu Leu Val Tyr Ser Leu Glu
500 505 510

Pro Glu Pro Ala Ala Lys Gly Leu Phe Thr Ile Ser Pro Glu Thr Gly
515 520 525

Glu Ile Gln Val Lys Thr Ser Leu Asp Arg Glu Gln Arg Glu Ser Tyr
530 535 540

Glu Leu Lys Val Val Ala Ala Asp Arg Gly Ser Pro Ser Leu Gln Gly
545 550 555 560

Thr Ala Thr Val Leu Val Asn Val Leu Asp Cys Asn Asp Asn Asp Pro
565 570 575

Lys Phe Met Leu Ser Gly Tyr Asn Phe Ser Val Met Glu Asn Met Pro
580 585 590

Ala Leu Ser Pro Val Gly Met Val Thr Val Ile Asp Gly Asp Lys Gly
595 600 605 

Glu Asn Ala Gln Val Gln Leu Ser Val Glu Gln Asp Asn Gly Asp Phe
610 615 620

Val Ile Gln Asn Gly Thr Gly Thr Ile Leu Ser Ser Leu Ser Phe Asp
625 630 635 640

Arg Glu Gln Gln Ser Thr Tyr Thr Phe Gln Leu Lys Ala Val Asp Gly
645 650 655

Gly Val Pro Pro Arg Ser Ala Tyr Val Gly Val Thr Ile Asn Val Leu
660 665 670

Asp Glu Asn Asp Asn Ala Pro Tyr Ile Thr Ala Pro Ser Asn Thr Ser
675 680 685

His Lys Leu Leu Thr Pro Gln Thr Arg Leu Gly Glu Thr Val Ser Gln
690 695 700

Val Ala Ala Glu Asp Phe Asp Ser Gly Val Asn Ala Glu Leu Ile Tyr
705 710 715 720

Ser Ile Ala Gly Gly Asn Pro Tyr Gly Leu Phe Gln Ile Gly Ser His
725 730 735

Ser Gly Ala Ile Thr Leu Glu Lys Glu Ile Glu Arg Arg His His Gly
740 745 750

Leu His Arg Leu Val Val Lys Val Ser Asp Arg Gly Lys Pro Pro Arg
755 760 765

Tyr Gly Thr Ala Leu Val His Leu Tyr Val Asn Glu Thr Leu Ala Asn
770 775 780

Arg Thr Leu Leu Glu Thr Leu Leu Gly His Ser Leu Asp Thr Pro Leu
785 790 795 800

Asp Ile Asp Ile Ala Gly Asp Pro Glu Tyr Glu Arg Ser Lys Gln Arg
805 810 815

Gly Asn Ile Leu Phe Gly Val Val Ala Gly Val Val Ala Val Ala Leu
820 825 830

Leu Ile Ala Leu Ala Val Leu Val Arg Tyr Cys Arg Gln Arg Glu Ala
835 840 845

Lys Ser Gly Tyr Gln Ala Gly Lys Lys Glu Thr Lys Asp Leu Tyr Ala
850 855 860

Pro Lys Pro Ser Gly Lys Ala Ser Lys Gly Asn Lys Ser Lys Gly Lys
865 870 875 880

Lys Ser Lys Ser Pro Lys Pro Val Lys Pro Val Glu Asp Glu Asp Glu
885 890 895

Ala Gly Leu Gln Lys Ser Leu Lys Phe Asn Leu Met Ser Asp Ala Pro
900 905 910

Gly Asp Ser Pro Arg Ile His Leu Pro Leu Asn Tyr Pro Pro Gly Ser
915 920 925

Pro Asp Leu Gly Arg His Tyr Arg Ser Asn Ser Pro Leu Pro Ser Ile
930 935 940

Gln Leu Gln Pro Gln Ser Pro Ser Ala Ser Lys Lys His Gln Val Val
945 950 955 960

Gln Asp Leu Pro Pro Ala Asn Thr Phe Val Gly Thr Gly Asp Thr Thr
965 970 975

Ser Thr Gly Ser Glu Gln Tyr Ser Asp Tyr Ser Tyr Arg Thr Asn Pro
980 985 990

Pro Lys Tyr Pro Ser Lys Gln Val Gly Gln Pro Phe Gln Leu Ser Thr
995 1000 1005

Pro Gln Pro Leu Pro His Pro Tyr His Gly Ala Ile Trp Thr Glu Val
1010 1015 1020

Trp Glu
1025



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4705 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 115..2827



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

CGAAAGCCAT GTCGGACTCG TCGCCCAGCG CCCAAGCGCT AACCCGCTGA AAGTTTCTCA 60

GCGAAATCTC AGCGACGATC TGGACCCCGC TGAGAGGAAC TGCTTTTGAG TGAG ATG 117
Met
1

vll Pro Glu 111 c GC G ? A CTG GTA AGC ACC 000 AGG G ™ GTG 165
Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val Val
5 10 15

SlJ SI? J TG f TT ? T ° CTT GGT CCC TTG AAC AAG GCT TCC ACG GTC ATT 213
Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val Ile
20 25 30

5iS iyl Si? ill 22 ^ GA GAG *™ 007 TTC GCT GTG GGC AAC 261
His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly Asn
35 40 45

GTG GTC GCG AAC CTT GGT TTG GAT CTC GGT AGC CTC TCA GCC CGC AGG 309
Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg Arg
50 55 60 65

TTC CCG GTG GTG TCT GGA GCT AGC CGA AGA TTC TTT GAG GTG AAC CGG 357 
Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn Arg 
70 75 80 

GAG ACC GGA GAG ATG TTT GTG AAC GAC CGT CTG GAT CGA GAG GAG CTG 405 
Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu Leu 
85 90 95 

TGT GGG ACA CTG CCC TCT TGC ACT GTA ACT CTG GAG TTG GTA GTG GAG 453 
Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val Glu 
100 105 110

AAC CCG CTG GAG CTG TTC AGC GTG GAA GTG GTC ATC CAG GAC ATC AAC 501 
Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile Asn
115 120 125 

GAC AAC AAT CCT GCT TTC CCT ACC CAG GAA ATG AAA TTG GAG ATT AGC 549 
Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile Ser
130 135 140 145 

GAG GCC GTG GCT CCG GGG ACG CGC TTT CCG CTC GAC ACC GCG CAC GAT 597 
Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His Asp 
150 155 160 

CCC GAT CTG GGA AGC AAC TCT TTA CAA ACC TAT GAG CTG AGC CGA AAT 645 
Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg Asn
165 170 175 

GAA TAC TTT GCG CTT CGC GTG CAG ACG CGG GAG GAC AGC ACC AAG TAC 693 
Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys Tyr
180 185 190 

GCG GAG CTG GTG TTG GAG CGC GCC CTG GAC CGA GAA CGG GAG CCT AGT 741 
Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro Ser 
195 200 205 

CTC CAG TTA GTG CTG ACG GCG TTG GAC GGA GGG ACC CCA GCT CTC TCC 789 
Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu Ser
210 215 220 225 

GCC AGC CTG CCT ATT CAC ATC AAG GTG CTG GAC GCG AAT GAC AAT GCG 837 
Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn Ala
230 235 240 

CCT GTC TTC AAC CAG TCC TTG TAC CGG GCG CGC GTT CCT GGA GGA TGC 885 
Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly Cys
245 250 255 

ACC TCC GGC ACG CGC GTG GTA CAA GTC CTT GCA ACG GAT CTG GAT GAA 933 
Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp Glu
260 265 270 

GGC CCC AAC GGT GAA ATT ATT TAC TCC TTC GGC AGC CAC AAC CGC GCC 981 
Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg Ala
275 280 285

GGC GTG CGC CAA CTA TTC GCC TTA GAC CTT GTA ACC GGG ATG CTG ACA 1029
Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu Thr
290 295 300 305

ATC AAG GGT CGG CTC GAC TTC GAG CAC ACC AAA CTC CAT GAG ATT TAC 1077
Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile Tyr
310 315 320

T?f ^ f f° *** GAC AAG GGC GCC *** CCC GAA GCA CAT TGC AAA 1125
Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys
325 330 335

GTG TTG GTG GAG GTT GTG GAT GTG AAT GAC AAC GCC CCG GAG ATC ACA 1173
Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile Thr
340 345 350

vll iS IS vl? l A ° c GC CCA GTA CCC CAG CAT GCC TCT ACT GTC 1221
Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr Val
355 360 365

Hs 111 Jl G ? TC c CT ACT GAC CTG GAT GCT GAG AAC GGG CTG 1269
Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly Leu
370 375 380 385

GTG ACC TGC GAA GTT CCA CCG GGT CTC CCT TTC AGC CTT ACT TCT TCC 1317
Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser Ser
390 395 400

CTC AAG AAT TAC TTC ACT TTG AAA ACC ACT CCA GAC CTG GAT CGG GAG 1365
Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg Glu
405 410 415

ACT GTG CCA GAA TAC AAC CTC AGC ATC ACC GCC CGA GAC GCC CCA ACC 1413
Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly Thr
420 425 430

CCT TCC CTC TCA GCC CTT ACA ATA GTG CGT GTT CAA GTG TCC GAC ATC 1461
Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp Ile
435 440 445

AAT GAC AAC CCT CCA CAA TCT TCT CAA TCT TCC TAC GAC GTT TAC ATT 1509
Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr Ile
450 455 460 465

ClJ C?J Jin ^ r CTC S CC °? G GCT CCA ATA CTA AAC CTA ACT GTC TGC 1557
Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val Trp
470 475 480

GAC CCC GAC GCC CCG CAG AAT GCT CGG CTT TCT TTC TTT CTC TTC c»e 1605
Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu Glu
485 490 495

CAA CGA GCT GAA ACC GGG CTA GTG GGT CCC TAT TTC ACA ATA AAT CCT 1653
Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn Arg
500 505 510

Asp He SI? c CC P A GTG CCC CTA GAC TAT GAG ^AT CGG 1701
Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp Arg
515 520 525

CGG GAA TTT GAA TTA ACA GCT CAT ATC AGC GAT GGC GGC ACC CCG GTC 1749
Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro Val
530 535 540 545

CTA GCC ACC AAC ATC AGC GTC AAC ATA TTT GTC ACT GAT CCC AAT GAC 1797 
Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn Asp
550 555 560 

AAT GCC CCC CAG GTC CTA TAT CCT CGG CCA GGT GGG AGC TCG GTG GAG 1845 
Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val Glu
565 570 575 

ATG CTG CCT CGA GGT ACC TCA GCT CGC CAC CTA GTG TCA CGG GTG CTA 1893 
Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val Val 
580 585 590 

GGC TGG GAC GCG GAT GCA GGG CAC AAT GCC TGG CTC TCC TAC AGT CTC 1941 
Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser Leu 
595 600 605 

TTT GGA TCC CCT AAC CAG AGC CTT TTT GCC ATA GGG CTG CAC ACT GGT 1989 
Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr Gly
610 615 620 625 

CAA ATC AGT ACT GCC CGT CCA GTC CAA GAC ACA GAT TCA CCC AGG CAG 2037 
Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg Gln
630 635 640 

ACT CTC ACT GTC TTG ATC AAA GAC AAT GGG CAG CCT TCG CTC TCC ACC 2085 
Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser Thr
645 650 655 

ACT GCT ACC CTC ACT GTG TCA GTA ACC GAG GAC TCT CCT GAA GCC CGA 2133 
Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala Arg
660 665 670 

GCC GAG TTC CCC TCT GGC TCT GCC CCC CGC GAG CAG AAA AAA AAT CTC 2181 
Ala Glu Phe Pro Ser Gly Ser Ala Pro Arg Glu Gln Lys Lys Asn Leu
675 680 685 

ACC TTT TAT CTA CTT CTT TCT CTA ATC CTG GTT TCT GTG GGC TTC GTG 2229 
Thr Phe Tyr Leu Leu Leu Ser Leu Ile Leu Val Ser Val Gly Phe Val
690 695 700 705 

?r T ? *w A GT ? TTC CGA CTA ATC ATA TTC AAA GTT T AC AAG TGG AAG CAG 2277 
Val Thr Val Phe Gly Val Ile Ile Phe Lys Val Tyr Lys Trp Lys Gln
710 715 720 

TCT AGA GAC CTA TAC CGA GCC CCG GTG AGC TCA CTG TAC CGA ACA CCA 2325 
Ser Arg Asp Leu Tyr Arg Ala Pro Val Ser Ser Leu Tyr Arg Thr Pro 
725 730 735 

GGG CCC TCC TTG CAC GCG GAC GCC GTG CGG GGA GGC CTG ATG TCG CCG 2373 
Gly Pro Ser Leu His Ala Asp Ala Val Arg Gly Gly Leu Met Ser Pro 
740 745 750 

CAC CTT TAC CAT CAG GTG TAT CTC ACC ACG GAC TCC CGC CGC AGC GAC 2421 
His Leu Tyr His Gln Val Tyr Leu Thr Thr Asp Ser Arg Arg Ser Asp
755 760 765

CCG CTG CTG AAG AAA CCT GGT GCA GCC ACT CCA CTG GCC AGC CGC CAC 2469
Pro Leu Leu Lys Lys Pro Gly Ala Ala Ser Pro Leu Ala Ser Arg Gln
770 775 780 785

AAC ACG CTG CGG AGC TGT GAT CCG GTG TTC TAT AGC CAG GTG TTG GGT 2517
Asn Thr Leu Arg Ser Cys Asp Pro Val Phe Tyr Arg Gln Val Leu Gly
790 795 800

GCA GAG AGC GCC CCT CCC GGA CAG CAA GCC CCG CCC AAC ACC CAC Tee 2565
Ala Glu Ser Ala Pro Pro Gly Gln Gln Ala Pro Pro Asn Thr Asp Trp
805 810 815

CGT TTC TCT CAC GCC CAG AGA CCC GGC ACC AGC CGC TCC CAA AAT GCC 2613
Arg Phe Ser Gln Ala Gln Arg Pro Gly Thr Ser Gly Ser Gln Asn Gly
820 825 830

GAT GAC ACC GGC ACC TGG CCC AAC AAC CAG TTT GAC ACA GAG ATG CTC 2661
Asp Asp Thr Gly Thr Trp Pro Asn Asn Gln Phe Asp Thr Glu Met Leu
835 840 845

G*£ J?? SI? tT° r TTG ^ G TCC GCC AGT GAA GCT GCT <»* GGG AGC TCC 2709
Gln Ala Met Ile Leu Ala Ser Ala Ser Glu Ala Ala Asp Gly Ser Ser
850 855 860 865

ACC CTG GGA GGG GGT GCC GGC ACC ATG GGA TTG AGC GCC CCC TAC GCA 2757
Thr Leu Gly Gly Gly Ala Gly Thr Met Gly Leu Ser Ala Arg Tyr Gly
870 875 880

CCC CAG TTC ACC CTG CAG CAC GTG CCC GAC TAC CGC CAG AAT GTC TAC 2805
Pro Gln Phe Thr Leu Gln His Val Pro Asp Tyr Arg Gln Asn Val Tyr
885 890 895

ne Pro" Gly 2er ^ S£ £S T GACCAA <*<* GCTGGCAAGC CGATGGCAAG 
900 

GCCCAGCAGC TGGCAATGGC AACAAGAAGA AGTCGGCAAG AAGGAGAAGA AGTAACATGG 2917 

AGGCCAGGCC AAGAGCCACA GGCCAGCCTC TCCCCGAACC AGCCCAGCTT CTCCTTACCT 2977 

GCACCCAGGC CTCAGAGTTT CACGGCTAAC CCCCACAATA CTGGTAGCGG CCAAGGCATC 3037 

TCCCTTGGAA ACAGAAACAA GTGCCATCAC ACCATCCCTT CCCCAGGTGT AATATCCAAA 3097 

GCAGTTCCGC TGGGAACCCC ATCCAATCAG TGGCTGTACC CATTTGGGTA GTGCCCTTCA 3157 

TGTAGACACC AAGAACCATT TGCCACACCC CGTTTAGTTA CAGCTGAACC CTCCATCTTC 3217

CAAATCAATC AGGCCCATCC ATCCCATGCC TCCCTCCTCC CCACCCCACT CCAACAGTTC 3277 

CTCTTTCCCG AGTAAGGTGG TTGGGGTGTT GAAGTACCAA CTAACCTACA AGCCTCCTAG 3337

TTCTGAAAAG TTGGAAGGGC ATCATGACCT CTTGGCCTCT CCTTTGATTC TCAATCTTCC 3397 

CCCAAAGCAT GGTTTGGTGC CACCCCCTTC ACCTCCTTCC AGAGCCCAAG ATCAATGCTC 3457

AAGTTTTGGA GGACATGATC ACCATCCCCA TCGTACTGAT GCTTCCTGCA TTTAGGGAGG 3517

GCATTTTGCT ACCAAGCCTC TTCCCAACGC CCTGGGACCA GTCTTCTGTT TTGTTTTTCA 3577 

TTGTTTGAGC TTTCCACTGC ATGCCTTGAC TTCCCCCACC TCCTCCTCAA ACAAGAGACT 3637

CCACTGCATG TTCCAAGACA GTATGGGGTG GTAAGATAAG GAAGGGAAGT CTGTGGATGT 3697

GGATGGTGGG GGCATGGACA AAGCTTGACA CATCAAGTTA TCAAGGCCTT GGAGGAGGCT 3757 

CTGTATGTCC TCAGGGGACT GACAACATCC TCCAGATTCC AGCCATAAAC CAATAACTAG 3817 

GCTGGACCCT TCCCACTACA TAATAGCGCT CAGCCAGGCA GCCAGCTTTG GGCTGAGCTA 3877 

ACAGGACCAA TGGATTAACT GGCATTTCAG TCCAAGGAAG CTCGAAGCAG GTTTAGGACC 3937 

AGGTCCCCTT GAGAGGTCAG AGGGGCCTCT GTGGGTGCTG GGTACTCCAG AGGTGCCACT 3997

GGTGGAAGGG TCAGCGGAGC CCCAGCAGGA AGGGTGGGCC AGCCAGCCCA TTCTTAGTCC 4057 

CTGGGTTGGG GAGGCAGGGA GCTAGGGCAG GGACCAAATG AACAGAAAGT CTCAGCCCAG 4117 

GATGGGGCTT CTTCAACAGG CCCCTGCCCT CCTGAAGCCT CAGTCCTTCA CCTTGCCAGG 4177 

TGCCCTTTCT CTTCCGTGAA GGCCACTGCC CAGGTCCCCA GTGCGCCCCC TAGTGGCCAT 4237 

AGCCTGCTTA AACTTCCCCA GTGCCTCCTT GTGATAGACC TTCTTCTCCC ACCCCCTTCT 4297 

GCCCCTGGGT CCCCGGCCAT CCAGCGGGGC TGCCAGAGAA CCCCAGACCT GCCCTTACAG 4357 

TAGTGTAGCG CCCCCTCCCT CTTTCGGCTG GTGTAGAATA GCCAGTAGTG TAGTGCGGTG 4417 

TGCTTTTACG TGATGGCGGG TGGGCAGCGG GCGGCGGCGT CCGCGCAGCC GTCTGTCCTT 4477 

GATCTGCCCG CGGCGGCCCG TGTTGTGTTT TGTGCTGTGT CCAGCGCTAA GGCGACCCCC 4537 

TCCCCCGTAC TCACTTCTCC TATAAGCGCT TCTCTTCGCA TAGTCACGTA GCTCCCACCC 4597

CACCCTCTTC CTGTGTCTCA CGCAAGTTTT ATACTCTAAT ATTTATATGG CTTTTTTTCT 4657 
TCGACAAAAA AATAATAAAA CGTTTCTTCT GAAAAAAAAA AAAAAAAA 4705
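The codon rows and the residue rows of SEQ ID NO: 96 can be cross-checked against each other by translating a codon row with the standard genetic code. The following is a minimal sketch, not part of the original listing; the helper name `translate_row` is hypothetical, and the example row is the one ending at nucleotide 357.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "*" marks a stop codon.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(codon_row: str) -> str:
    """Translate a whitespace-separated row of codons into three-letter codes."""
    return " ".join(ONE_TO_THREE[CODON_TABLE[c]] for c in codon_row.split())


row = "TTC CCG GTG GTG TCT GGA GCT AGC CGA AGA TTC TTT GAG GTG AAC CGG"
print(translate_row(row))
```

A row whose translation disagrees with the printed residue line is a candidate for a scanning error in either line; the sketch does not decide which of the two is correct.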



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 904 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Met Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val 
1 5 10 15

Val Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val
20 25 30

Ile His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly
35 40 45

Asn Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg 
50 55 60

Arg Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn
65 70 75 80

Arg Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu 
85 90 95 

Leu Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val 
100 105 110

Glu Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile
115 120 125

Asn Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile
130 135 140

Ser Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His
145 150 155 160

Asp Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg
165 170 175

Asn Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys
180 185 190

Tyr Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro
195 200 205

Ser Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu
210 215 220

Ser Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn
225 230 235 240

Ala Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly
245 250 255

Cys Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp
260 265 270

Glu Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg
275 280 285

Ala Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu
290 295 300

Thr Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile
305 310 315 320

Tyr Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys
325 330 335

Lys Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile
340 345 350

Thr Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr
355 360 365

Val Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly
370 375 380

Leu Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser
385 390 395 400

Ser Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg 
405 410 415 

Glu Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly
420 425 430

Thr Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp
435 440 445

Ile Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr
450 455 460

Ile Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val
465 470 475 480

Trp Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu
485 490 495

Glu Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn
500 505 510

Arg Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp
515 520 525

Arg Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro
530 535 540

Val Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn
545 550 555 560

Asp Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val
565 570 575

Glu Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val
580 585 590

Val Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser
595 600 605

Leu Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr
610 615 620

Gly Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg
625 630 635 640

Gln Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser
645 650 655

Thr Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala
660 665 670

Arg Ala Glu Phe Pro Ser Gly Ser Ala Pro Arg Glu Gln Lys Lys Asn
675 680 685

Leu Thr Phe Tyr Leu Leu Leu Ser Leu Ile Leu Val Ser Val Gly Phe
690 695 700

Val Val Thr Val Phe Gly Val Ile Ile Phe Lys Val Tyr Lys Trp Lys
705 710 715 720

Gln Ser Arg Asp Leu Tyr Arg Ala Pro Val Ser Ser Leu Tyr Arg Thr
725 730 735

Pro Gly Pro Ser Leu His Ala Asp Ala Val Arg Gly Gly Leu Met Ser
740 745 750

Pro His Leu Tyr His Gln Val Tyr Leu Thr Thr Asp Ser Arg Arg Ser
755 760 765

Asp Pro Leu Leu Lys Lys Pro Gly Ala Ala Ser Pro Leu Ala Ser Arg
770 775 780

Gln Asn Thr Leu Arg Ser Cys Asp Pro Val Phe Tyr Arg Gln Val Leu
785 790 795 800

Gly Ala Glu Ser Ala Pro Pro Gly Gln Gln Ala Pro Pro Asn Thr Asp
805 810 815

Trp Arg Phe Ser Gln Ala Gln Arg Pro Gly Thr Ser Gly Ser Gln Asn
820 825 830

Gly Asp Asp Thr Gly Thr Trp Pro Asn Asn Gln Phe Asp Thr Glu Met
835 840 845

Leu Gln Ala Met Ile Leu Ala Ser Ala Ser Glu Ala Ala Asp Gly Ser
850 855 860

Ser Thr Leu Gly Gly Gly Ala Gly Thr Met Gly Leu Ser Ala Arg Tyr
865 870 875 880

Gly Pro Gln Phe Thr Leu Gln His Val Pro Asp Tyr Arg Gln Asn Val
885 890 895

Tyr Ile Pro Gly Ser Asn Ala His
900



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Asp Trp Val Ile Pro Pro Ile Asn Leu Pro Glu Asn Ser Arg Gly Pro
1 5 10 15

Phe Pro Gln Glu Leu Val Arg Ile Arg Ser Asp Arg Asp Lys Asn Leu
20 25 30

Ser Leu Arg Tyr Thr Val Thr Gly Pro Gly Ala Asp Gln Pro Pro Thr
35 40 45

Gly Ile Phe Ile Ile Asn Pro Ile Ser Gly Gln Leu Ser Val Thr Lys
50 55 60

Pro Leu Asp Arg Glu Gln Ile Ala Arg Phe His Leu Arg Ala His Ala
65 70 75 80

Val Asp Ile Asn Gly Asn Gln Val Glu Asn Pro Ile Asp Ile Val Ile
85 90 95

Asn Val Ile Asp Met Asn Asp Asn Arg Pro Glu Phe Thr Ala Met Thr
100 105 110

Phe Tyr Gly Glu Val Pro Glu Asn Arg Val Asp Ile Ile Val Ala Asn
115 120 125

Leu Thr Val Thr Asp Lys Asp Gln Pro His Thr Pro Ala Trp Asn Ala
130 135 140

Val Thr Arg Ile Ser Gly Gly Asp Pro Thr Gly Arg Phe Ala Ile Gln
145 150 155 160

Thr Asp Pro Asn Ser Asn Asp Gly Leu Val Thr Val Val Lys Pro Ile
165 170 175

Asp Phe Glu Thr Asn Arg Met Phe Val Leu Thr Val Ala Ala Glu Asn
180 185 190

Gln Val Pro Leu Ala Lys Gly Ile Gln His Pro Pro Gln Ser Thr Ala
195 200 205

Thr Val Ser Val Thr Val Ile Asp Val Asn Glu Asn Pro Tyr Phe Ala
210 215 220

Pro Asn Pro Lys Ile Ile Arg Gln Glu Glu Gly Leu His Ala Gly Thr
225 230 235 240

Met Leu Thr Thr Phe Thr Ala Gly Asp Pro Asp Arg Tyr Met Gln Gln
245 250 255

Asn Ile Arg Tyr Thr Lys Leu Ser Asp Pro Ala Asn Trp Leu Lys Ile
260 265 270

Asp Pro Val Asn Gly Gln Ile Thr Thr Ile Ala Val Leu Asp Arg Glu
275 280 285

Ser «S Asn Val Lys Asn Asn Ile Tyr Asn Ala Thr Phe Leu Ala Ser
290 295 300

Asp Asn Gly Ile Pro Pro Met Ser Gly Thr Gly Thr Leu Gln Ile Tyr
305 310 315 320

Leu Leu Asp Ile Asn Asp Asn Ala Pro Gln Val Leu Pro Gln Glu Ala
325 330 335

Glu Thr Cys Glu Thr Pro Asp Pro Asn Ser Ile Asn Ile Thr Thr Ala
340 345 350

Leu Asp Tyr Asp Ile Asp Pro Asn Ala Gly Pro Phe Ala Tyr Asp Leu
355 360 365

Pro Leu Ser Pro Val Thr Ile Lys Arg Asn Trp Thr Ile Thr Arg Leu
370 375 380

Asn Gly Asp Phe Ala Gln Leu Asn Leu Lys Ile Lys Phe Leu Glu Ala
385 390 395 400

Gly Ile Tyr Glu Val Pro Ile Ile Ile Thr Asp Ser Gly Asn Pro Pro
405 410 415

Lys Ser Asn Lys Ser Ile Leu Arg Val Arg Val Cys Gln Cys Asp Phe
420 425 430

Asn Gly Asp Cys Thr Asp Val Asp Arg
435 440



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99: 

Glu Asp Thr Val Tyr Ser Phe Asp Ile Pro Glu Asn Ala Gln Arg Gly
1 5 10 15

Tyr Gln Val Gly Gln Ile Val Ala Arg Asp Ala Asp Leu Gly Gln Asn
20 25 30

Ala Gln Leu Ser Tyr Gly Val Val Ser Asp Trp Ala Asn Asp Val Phe
35 40 45

Ser Leu Asn Pro Gln Thr Gly Met Leu Thr Leu Thr Ala Arg Leu Asp
50 55 60

Tyr Glu Glu Val Gln His Tyr Ile Leu Ile Val Gln Ala Gln Asp Asn
65 70 75 80

Gly Gln Pro Ser Leu Ser Thr Thr Ile Thr Val Tyr Cys Asn Val Leu
85 90 95

Asp Leu Asn Asp Asn Ala Pro Ile Phe
100 105

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Asp Xaa Asp Xaa Gly Xaa Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Ala Xaa Asp Xaa Gly Xaa Pro 
1 5 
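The seven-residue sequences of SEQ ID NO: 100 and SEQ ID NO: 101 can be treated as patterns in which each Xaa position accepts any residue. The sketch below is illustrative only and assumes that reading of Xaa; the function name `find_motif` is hypothetical, and the two test fragments are residues 17 to 32 and 65 to 86 of SEQ ID NO: 99.

```python
def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions where the motif matches the sequence.

    Both arguments are whitespace-separated three-letter residue codes.
    A motif position written as "Xaa" is assumed to match any residue.
    """
    residues = sequence.split()
    pattern = motif.split()
    hits = []
    for start in range(len(residues) - len(pattern) + 1):
        window = residues[start:start + len(pattern)]
        if all(p == "Xaa" or p == r for p, r in zip(pattern, window)):
            hits.append(start + 1)
    return hits


MOTIF_100 = "Asp Xaa Asp Xaa Gly Xaa Asn"
MOTIF_101 = "Ala Xaa Asp Xaa Gly Xaa Pro"

fragment_a = ("Tyr Gln Val Gly Gln Ile Val Ala Arg Asp Ala Asp Leu Gly Gln Asn")
fragment_b = ("Tyr Glu Glu Val Gln His Tyr Ile Leu Ile Val Gln Ala Gln Asp Asn "
              "Gly Gln Pro Ser Leu Ser")

print(find_motif(fragment_a, MOTIF_100))
print(find_motif(fragment_b, MOTIF_101))
```

Positions are reported relative to the start of the fragment supplied, so a hit in a fragment must be offset by the fragment's own starting residue number to obtain a position in the full sequence.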

(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4650 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 495..4103
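A CDS location given as an inclusive nucleotide range can be checked by simple arithmetic: the span should be a whole number of codons. The sketch below is illustrative and not part of the listing; the helper name `cds_span` is hypothetical.

```python
def cds_span(start: int, end: int) -> tuple[int, int, int]:
    """Return (nucleotides, whole codons, leftover bases) for an inclusive range."""
    nucleotides = end - start + 1
    codons, leftover = divmod(nucleotides, 3)
    return nucleotides, codons, leftover


# Range stated for the CDS feature of SEQ ID NO: 102.
print(cds_span(495, 4103))
```

For the range 495..4103 the span is 3609 nucleotides, or 1203 codons with no bases left over, which agrees with the residue numbering printed under the final coding row of this sequence.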

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

CCTCTATTCG ACATTCTCTT TGGATTGTTT TGCTATAACT TGAAATTTGG GATGTCACAA 60 

ACGAAACTGT CATCTGTTTC CGCCAAACTG TGCTTCTGCT AATCTCCCAG GCTGGCAGCA 120 

TTGGAGACTT GCTGACTTCT TTCATCCCCC ACTCTTTTCA CCTGAAATTC CTTTCCTTGG 180 

TTTTGCTCTA AGTCCTATGC TTCAGTCAGG GGCCAACCAA ATCTCACTGC CTCCTTTTTA 240 

TCATGAAGCC TTTGATCACT GATAGTTCTT TTTATATCTT GAAAAATCAC CCTTCCCAGT 300 

ACAGTTAATA TTTAGTATCT CTACTCATCT TGGCACTTAC TCACAGCTCC ATAATTCAGT 360 

CGTTTTCGTA CCTCTTCATG GTGATGGGGA GCCCTTTGGA GGTGGTGACT GTGCTTTATA 420 

CTCCTCATGA TGCTTCACAT GTGGCAGGCG TGGAGTGCCC GGAGGCGGCC CTCCTGATTC 480

TCCGGCCTCC CAGC ATG GAG CCC CTG AGG CAC AGC CCA GGC CCT CGC GCG 530
Met Glu Pro Leu Arg His Ser Pro Gly Pro Gly Gly
1 5 10

CAA CGG CTA CTG CTG CCC TCC ATG CTG CTA GCA CTG CTG CTC CTG CTG 578
Gln Arg Leu Leu Leu Pro Ser Met Leu Leu Ala Leu Leu Leu Leu Leu
15 20 25

GCT CCA TCC CCA GGC CAC GCC ACT CGG CTA GTG TAC AAG GTG CCC CAG 626
Ala Pro Ser Pro Gly His Ala Thr Arg Val Val Tyr Lys Val Pro Glu
30 35 40

GAA CAG CCA CCC AAC ACC CTC ATT GGC AGC CTC GCA GCC CAC TAT GCT 674
Glu Gln Pro Pro Asn Thr Leu Ile Gly Ser Leu Ala Ala Asp Tyr Gly
45 50 55 60

TTT CCA CAT GTG GCG CAC CTG TAC AAG CTA GAG GTG GGT GCC CCG TAC 722
Phe Pro Asp Val Gly His Leu Tyr Lys Leu Glu Val Gly Ala Pro Tyr
65 70 75

CTT CGC GTG GAT GGC AAG ACA GGT CAC ATT TTC ACC ACC GAG ACC TCC 770
Leu Arg Val Asp Gly Lys Thr Gly Asp Ile Phe Thr Thr Glu Thr Ser
80 85 90

ATC GAC CGT GAG GGG CTC CGT GAA TGC CAG AAC CAG CTC CCT GCT GAT 818
Ile Asp Arg Glu Gly Leu Arg Glu Cys Gln Asn Gln Leu Pro Gly Asp
95 100 105

CCC TGC ATC CTG GAG TTT GAG GTA TCT ATC ACA GAC CTC GTG CAG AAT 866
Pro Cys Ile Leu Glu Phe Glu Val Ser Ile Thr Asp Leu Val Gln Asn
110 115 120

GCG AGC CCC CGG CTG CTA GAG GGC CAG ATA GAA GTA CAA GAC ATC AAT 914
Ala Ser Pro Arg Leu Leu Glu Gly Gln Ile Glu Val Gln Asp Ile Asn
125 130 135 140

A«£ £^ £k A o CC ^° T ? C GCC TCA CCA CTC ATG ACT CTG GCC ATC CCT 962
Asp Asn Thr Pro Asn Phe Ala Ser Pro Val Ile Thr Leu Ala Ile Pro
145 150 155

r?° ^ C t?° ^ C A ?° 000 TCA CTC TTC CCC ATC CCG CTG GCT TCA GAC 1010
Glu Asn Thr Asn Ile Gly Ser Leu Phe Pro Ile Pro Leu Ala Ser Asp
160 165 170

tal !?I £? T n CC ™° °? T GTG GCA TCC TAT GAG CTG CAG GTG GCA 1058
Arg Asp Ala Gly Pro Asn Gly Val Ala Ser Tyr Glu Leu Gln Val Ala
175 180 185

tlu 2S rt° nt G ^° °^ CCA CAG CTC ATT GTG ATG CCC AAC CTG 1106
Glu Asp Gln Glu Glu Lys Gln Pro Gln Leu Ile Val Met Gly Asn Leu
190 195 200

til HI GGC J 00 GAC TCC TAT CAC CTC ACC ATC AAG GTG CAG GAT 1154
Asp Arg Glu Arg Trp Asp Ser Tyr Asp Leu Thr Ile Lys Val Gln Asp
205 210 215 220

C?^ c GC » CC GCA CGC GCC ACG AGT G CC CTG CTC CGT GTC ACC CTC 1202
Gly Gly Ser Pro Pro Arg Ala Thr Ser Ala Leu Leu Arg Val Thr Val
225 230 235

CTT GAC ACC AAT GAC AAC GCC CCC AAG TTT GAG CGG CCC TCC TAT GAG 1250
Leu Asp Thr Asn Asp Asn Ala Pro Lys Phe Glu Arg Pro Ser Tyr Glu 
240 245 250 

GCC CAA CTA TCT GAG AAT AGC CCC ATA GGC CAC TCG GTC ATC CAG GTC 1298 
Ala Glu Leu Ser Glu Asn Ser Pro Ile Gly His Ser Val Ile Gln Val
255 260 265 

AAG GCC AAT GAC TCA GAC CAA GGT GCC AAT GCA GAA ATC GAA TAC ACA 1346 
Lys Ala Asn Asp Ser Asp Gln Gly Ala Asn Ala Glu Ile Glu Tyr Thr
270 275 280 

TTC CAC CAG GCG CCC GAA GTT GTG AGC CGT CTT CTT CGA CTG GAC AGG 1394 
Phe His Gln Ala Pro Glu Val Val Arg Arg Leu Leu Arg Leu Asp Arg
285 290 295 300

AAC ACT GGA CTT ATC ACT GTT CAG GGC CCG GTG GAC CGT GAG GAC CTA 1442 
Asn Thr Gly Leu Ile Thr Val Gln Gly Pro Val Asp Arg Glu Asp Leu
305 310 315 

AGC ACC CTG CGC TTC TCA GTG CTT GCT AAC CAC CGA GCC ACC AAC CCC 1490 
Ser Thr Leu Arg Phe Ser Val Leu Ala Lys Asp Arg Gly Thr Asn Pro 
320 325 330 

AAG ACT GCC CGT GCC CAG GTG GTT GTG ACC GTG AAG GAC ATG AAT GAC 1538 
Lys Ser Ala Arg Ala Gln Val Val Val Thr Val Lys Asp Met Asn Asp
335 340 345

AAT GCC CCC ACC ATT GAG ATC CGG GGC ATA GCC CTA GTC ACT CAT CAA 1586 
Asn Ala Pro Thr Ile Glu Ile Arg Gly Ile Gly Leu Val Thr His Gln
350 355 360 

GAT GGG ATG GCT AAC ATC TCA GAG GAT GTG CCA GAG GAG ACA GCT GTG 1634 
Asp Gly Met Ala Asn Ile Ser Glu Asp Val Ala Glu Glu Thr Ala Val
365 370 375 380 

GCC CTG GTG CAG GTG TCT GAC CGA GAT CAG GCA GAG AAT GCA GCT GTC 1682 
Ala Leu Val Gln Val Ser Asp Arg Asp Glu Gly Glu Asn Ala Ala Val
385 390 395 

ACC TGT GTC GTG GCA GGT GAT GTG CCC TTC CAG CTG CGC CAG GCC ACT 1730 
Thr Cys Val Val Ala Gly Asp Val Pro Phe Gln Leu Arg Gln Ala Ser
400 405 410

GAG ACA GGC ACT GAC AGC AAG AAG AAC TAT TTC CTG CAG ACT ACC ACC 1778 
Glu Thr Gly Ser Asp Ser Lys Lys Lys Tyr Phe Leu Gin Thr Thr Thr 
415 420 425 

CCG CTA GAC TAC GAG AAG GTC AAA GAC TAC ACC ATT GAG ATT GTG GCT 1826 
Pro Leu Asp Tyr Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val Ala
430 435 440

GTG GAC TCT GGC AAC CCC CCA CTC TCC ACC ACT AAC TCC CTC AAG GTG 1874 
Val Asp Ser Gly Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys Val
445 450 455 460

rt!! S T ? ?, T ? f AC ?, T ? *** GAC AAC GCA CCT GTC "C ACT CAG ACT GTC 1922 
Gln Val Val Asp Val Asn Asp Asn Ala Pro Val Phe Thr Gln Ser Val
465 470 475

ACT GAG CTC GCC TTC CCG GAA AAC AAC AAG CCT GGT GAA CTG ATT GCT 1970
Thr Glu Val Ala Phe Pro Glu Asn Asn Lys Pro Gly Glu Val Ile Ala
480 485 490 

GAG ATC ACT GCC ACT GAT GCT GAC TCT GGC TCT AAT GCT GAG CTG GTT 2018 
Glu Ile Thr Ala Ser Asp Ala Asp Ser Gly Ser Asn Ala Glu Leu Val
495 500 505 

TAC TCT CTG GAG CCT GAG CCG GCT GCT AAG GGC CTC TTC ACC ATC TCA 2066 
Tyr Ser Leu Glu Pro Glu Pro Ala Ala Lys Gly Leu Phe Thr Ile Ser
510 515 520 

CCC GAG ACT GGA GAG ATC CAG GTG AAG ACA TCT CTG GAT CGG GAA CAG 2114 
Pro Glu Thr Gly Glu Ile Gln Val Lys Thr Ser Leu Asp Arg Glu Gln
525 530 535 540 

CGG GAG AGC TAT GAG TTG AAG GTG GTG GCA GCT GAC CGG GGC AGT CCT 2162 
Arg Glu Ser Tyr Glu Leu Lys Val Val Ala Ala Asp Arg Gly Ser Pro
545 550 555 

AGC CTC CAG GGC ACA GCC ACT GTC CTT GTC AAT GTG CTG GAC TGC AAT 2210 
Ser Leu Gln Gly Thr Ala Thr Val Leu Val Asn Val Leu Asp Cys Asn
560 565 570 

GAC AAT GAC CCC AAA TTT ATG CTG AGT GGC TAC AAC TTC TCA GTG ATG 2258 
Asp Asn Asp Pro Lys Phe Met Leu Ser Gly Tyr Asn Phe Ser Val Met 
575 580 585 

GAG AAC ATG CCA GCA CTC AGT CCA GTG GGC ATG GTG ACT CTC ATT GAT 2306 
Glu Asn Met Pro Ala Leu Ser Pro Val Gly Met Val Thr Val Ile Asp
590 595 600 

GGA GAC AAG GGG GAG AAT GCC CAG GTG CAG CTC TCA GTG GAG CAG GAC 2354 
Gly Asp Lys Gly Glu Asn Ala Gln Val Gln Leu Ser Val Glu Gln Asp
605 610 615 620



AAC GGT GAC TTT GTT ATC CAG AAT GGC ACA GGC ACC ATC CTA TCC AGC 2402
Asn Gly Asp Phe Val Ile Gln Asn Gly Thr Gly Thr Ile Leu Ser Ser
625 630 635

CTG AGC TTT GAT CGA GAG CAA CAA ACC ACC TAC ACC TTC CAG CTG AAG 2450
Leu Ser Phe Asp Arg Glu Gln Gln Ser Thr Tyr Thr Phe Gln Leu Lys
640 645 650 

GCA GTG GAT GCT GGC GTC CCA CCT CGC TCA CCT TAC GTT GGT GTC ACC 2498 
Ala Val Asp Gly Gly Val Pro Pro Arg Ser Ala Tyr Val Gly Val Thr
655 660 665 

ATC AAT GTG CTG GAC GAG AAT GAC AAC CCA CCC TAT ATC ACT GCC CCT 2546 
Ile Asn Val Leu Asp Glu Asn Asp Asn Ala Pro Tyr Ile Thr Ala Pro
670 675 680 

TCT AAC ACC TCT CAC AAG CTG CTG ACC CCC CAG ACA CGT CTT GGT GAG 2594 
Ser Asn Thr Ser His Lys Leu Leu Thr Pro Gln Thr Arg Leu Gly Glu
685 690 695 700 

ACC GTC AGC CAG GTG GCA GCC GAG GAC TTT GAC TCT GCT GTC AAT GCC 2642 
Thr Val Ser Gln Val Ala Ala Glu Asp Phe Asp Ser Gly Val Asn Ala
705 710 715

GAG CTG ATC TAC AGC ATT GCA GGT GGC AAC CCT TAT GGA CTC TTC CAG 2690
Glu Leu Ile Tyr Ser Ile Ala Gly Gly Asn Pro Tyr Gly Leu Phe Gln
720 725 730 

ATT GGG TCA CAT TCA GGT GCC ATC ACC CTG GAG AAG GAG ATT GAG CGG 2738 
Ile Gly Ser His Ser Gly Ala Ile Thr Leu Glu Lys Glu Ile Glu Arg
735 740 745

CCC CAC CAT GGG CTA CAC CGC CTG GTG GTG AAG GTC AGT GAC CGC GGC 2786 
Arg His His Gly Leu His Arg Leu Val Val Lys Val Ser Asp Arg Gly
750 755 760 

AAG CCC CCA CGC TAT GGC ACA GCC TTG GTC CAT CTT TAT GTC AAT GAG 2834 
Lys Pro Pro Arg Tyr Gly Thr Ala Leu Val His Leu Tyr Val Asn Glu
765 770 775 780 

ACT CTG GCC AAC CGC ACG CTG CTG GAG ACC CTC CTG GGC CAC AGC CTG 2882 
Thr Leu Ala Asn Arg Thr Leu Leu Glu Thr Leu Leu Gly His Ser Leu
785 790 795 

GAC ACG CCG CTG GAT ATT GAC ATT GCT GGG GAT CCA GAA TAT GAG CGC 2930 
Asp Thr Pro Leu Asp Ile Asp Ile Ala Gly Asp Pro Glu Tyr Glu Arg
800 805 810 

TCC AAG CAG CGT GGC AAC ATT CTC TTT GGT GTG GTG CCT GGT GTG GTG 2978 
Ser Lys Gln Arg Gly Asn Ile Leu Phe Gly Val Val Ala Gly Val Val
815 820 825 

GCC GTG GCC TTG CTC ATC GCC CTG GCG CTT CTT GTG CGC TAC TGC AGA 3026 
Ala Val Ala Leu Leu Ile Ala Leu Ala Val Leu Val Arg Tyr Cys Arg
830 835 840 

CAG CGG GAG GCC AAA AGT GGT TAC CAC GCT GGT AAG AAG GAG ACC AAG 3074 
Gln Arg Glu Ala Lys Ser Gly Tyr Gln Ala Gly Lys Lys Glu Thr Lys
845 850 855 860

GAC CTG TAT GCC CCC AAG CCC AGT GGC AAG GCC TCC AAG GGA AAC AAA 3122 
Asp Leu Tyr Ala Pro Lys Pro Ser Gly Lys Ala Ser Lys Gly Asn Lys
865 870 875 

AGC AAA GGC AAG AAG AGC AAG TCC CCA AAG CCC GTG AAG CCA GTG GAG 3170 
Ser Lys Gly Lys Lys Ser Lys Ser Pro Lys Pro Val Lys Pro Val Glu
880 885 890 

a- 0 , f AT S AG °? C 000 CTG CAG TCC CTC TTC AAC CTG ATG 3218 

Asp Glu Asp Glu Ala Gly Leu Gin Lys Ser Leu Lys Phe Asn Leu Met 
895 900 905 

AGC GAT GCC CCT GGC GAC ACT CCC CCC ATC CAC CTG CCC CTC AAC TAC 3266 

Ser Asp Ala Pro Gly Asp Ser Pro Arg Ile His Leu Pro Leu Asn Tyr

910 915 920 

CCA CCA CGC AGC CCT GAC CTG GGC CGC CAC TAT CGC TCT AAC TCC CCA 3314 
Pro Pro Gly Ser Pro Asp Leu Gly Arg His Tyr Arg Ser Asn Ser Pro
925 930 935 940

Zlu Prl r?f CA ° CTC CAG CCC CAG TCA CCC TCA CCC TC C AAG AAG 3362 

Leu Pro Ser Ile Gln Leu Gln Pro Gln Ser Pro Ser Ala Ser Lys Lys
945 950 955

CAC CAG GTG GTA CAG GAC CTG CCA CCT GCA AAC ACA TTC GTG GGC ACC 3410
His Gln Val Val Gln Asp Leu Pro Pro Ala Asn Thr Phe Val Gly Thr
960 965 970

GGG GAC ACC ACG TCC ACG GGC TCT CAG CAG TAC TCC GAC TAC AGC TAC 3458
Gly Asp Thr Thr Ser Thr Gly Ser Glu Gln Tyr Ser Asp Tyr Ser Tyr
975 980 985

CGC ACC AAC CCC CCC AAA TAC CCC AGC AAG CAG TTA CCT CAC CGC CGC 3506
Arg Thr Asn Pro Pro Lys Tyr Pro Ser Lys Gln Leu Pro His Arg Arg
990 995 1000

GTC ACC TTC TCG GCC ACC AGC CAG GCC CAG GAG CTG CAG GAC CCA TCC 3554
Val Thr Phe Ser Ala Thr Ser Gln Ala Gln Glu Leu Gln Asp Pro Ser
1005 1010 1015 1020

CAG CAC AGT TAC TAT GAC AGT GGC CTG GAG GAG TCT GAG ACG CCG TCC 3602
Gln His Ser Tyr Tyr Asp Ser Gly Leu Glu Glu Ser Glu Thr Pro Ser
1025 1030 1035

AGC AAG TCA TCC TCA GGG CCT CGA CTC GGT CCC CTG GCC CTG CCT GAG 3650
Ser Lys Ser Ser Ser Gly Pro Arg Leu Gly Pro Leu Ala Leu Pro Glu
1040 1045 1050

GAT CAC TAT GAG CGC ACC ACC CCT GAT GGC AGC ATA GGA GAG ATG GAG 3698
Asp His Tyr Glu Arg Thr Thr Pro Asp Gly Ser Ile Gly Glu Met Glu
1055 1060 1065

CAC CCC GAG AAT GAC CTT CGC CCT TTG CCT CAT GTC GCC ATG ACA GGC 3746
His Pro Glu Asn Asp Leu Arg Pro Leu Pro Asp Val Ala Met Thr Gly
1070 1075 1080

ACA TGT ACC CGG GAG TGC AGT GAG TTT GGC CAC TCT GAC ACA TGC TGG 3794
Thr Cys Thr Arg Glu Cys Ser Glu Phe Gly His Ser Asp Thr Cys Trp
1085 1090 1095 1100

ATG CCT GGC CAG TCA TCT CCC AGC CGC CGG ACC AAC AGC AGC GCC CTC 3842
Met Pro Gly Gln Ser Ser Pro Ser Arg Arg Thr Lys Ser Ser Ala Leu
1105 1110 1115

AAA CTC TCC ACC TTC ATG CCT TAC CAG GAC CGA GGA GGG CAG GAG CCT 3890
Lys Leu Ser Thr Phe Met Pro Tyr Gln Asp Arg Gly Gly Gln Glu Pro
1120 1125 1130



1135 



1150 



1165 



AGC 


CCC 


AGC 


CCC 


CCG 


GAA 


GAC 


CGG 


AAC 


ACC 


AAA 


ACG 


3938 


Ser 


Pro 


Ser 


Pro 


Pro 


Glu 


Asp Arg 


Asn 


Thr 


Lys 


Thr 










1140 








1145 






CTC 


CTG 


CCC 


TCC 


TAC 


AGT 


GCC 


TTC 


TCC 


CAC 


AGT 


AGC 


3986 


Leu 


Leu 


Pro 


Ser 


Tyr 


Ser 


Ala 


Phe 


Ser 


His 


Ser 


Ser 








1155 








1160 










AAG 


GAC 


TCG 


GCC 


ACC 


TTG 


GAG 


GAA 


ATC 


CCC 


CTG 


ACC 


4034 


Lys 


Asp 


Ser 


Ala 


Thr 


Leu 


Glu 


Glu 


lie 


Pro 


Leu 


Thr 






1170 








1175 








1180 




TTC 


CCA 


CCC 


GCA 


GCC 


ACA 


CCG 


GCA 


TCT 


GCC 


CAG 


ACG 


4082 


Phe 


Pro 


Pro 


Ala 


Ala 


Thr 


Pro 


Ala 


Ser 


Ala 


Gin 


Thr 




1185 










1190 








1195 
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GCC AAG CGC GAG ATC TAC CTG TGAGCCCCCT ACTGGCCGGC CCCCCTCCCC 4133 
Ala Lys Arg Glu lie Tyr Leu 
1200 

CAGCGCCGGC CAGCTCCCAA ATGCCCATTC CAGGGCCTCA CTCTCCACCC CTTCAGCGTG 4193 

GACTTCCTGC CAGGGCCCAA GTGGGGGTAT CACTGACCTC ATGACCACGC TGGCCCTTCT 4253 

CCCATGCAGG GTCCAGGTCC TCTCCCCTCA TTTCCATCTC CCAGCCCAGG GGCCCCTTCC 4313 

CCTTTATGGG GCTTCCCCCA GCTGATGCCC AAGAGGGCTC CTCTGCAATG ACTGGGCTCC 4373 

TTCCCTTGAC TTCCAGGGAG CACCCCCTCG ATTTGGGCAG ATGGTGGAGT CAAGGGTGGG 4433 

CAGCGTACTT CTAACTCATT GTTTCCCTCA TGGCCGACCA GGGCGGGGAT AGCATGCCCA 4493 

ATTTTAGCCC TGAAGCAGGG CTGAACTGGG GAGCCCCTTT CCCTGGGAGC TCCCAGAGCA 4553 

AACTCTTGAC CACCAGTCGC TCCCTGAAGG GCTTTTGTTA CCAAACGTGG GGTAGGCACG 4613 

GGGGTGGGAG TGGAGCGGAG GCCTTGTTTT CCCGTGG 4650 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1203 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103 : 

Met Glu Pro Leu Arg His Ser Pro Gly Pro Gly Gly Gln Arg Leu Leu
1 5 10 15

Leu Pro Ser Met Leu Leu Ala Leu Leu Leu Leu Leu Ala Pro Ser Pro
20 25 30

Gly His Ala Thr Arg Val Val Tyr Lys Val Pro Glu Glu Gln Pro Pro
35 40 45

Asn Thr Leu Ile Gly Ser Leu Ala Ala Asp Tyr Gly Phe Pro Asp Val
50 55 60

Gly His Leu Tyr Lys Leu Glu Val Gly Ala Pro Tyr Leu Arg Val Asp
65 70 75 80

Gly Lys Thr Gly Asp Ile Phe Thr Thr Glu Thr Ser Ile Asp Arg Glu
85 90 95

Gly Leu Arg Glu Cys Gln Asn Gln Leu Pro Gly Asp Pro Cys Ile Leu
100 105 110

Glu Phe Glu Val Ser Ile Thr Asp Leu Val Gln Asn Ala Ser Pro Arg
115 120 125




Leu Leu Glu Gly Gln Ile Glu Val Gln Asp Ile Asn Asp Asn Thr Pro
130 135 140

Asn Phe Ala Ser Pro Val Ile Thr Leu Ala Ile Pro Glu Asn Thr Asn
145 150 155 160

Ile Gly Ser Leu Phe Pro Ile Pro Leu Ala Ser Asp Arg Asp Ala Gly
165 170 175

Pro Asn Gly Val Ala Ser Tyr Glu Leu Gln Val Ala Glu Asp Gln Glu
180 185 190

Glu Lys Gln Pro Gln Leu Ile Val Met Gly Asn Leu Asp Arg Glu Arg
195 200 205

Trp Asp Ser Tyr Asp Leu Thr Ile Lys Val Gln Asp Gly Gly Ser Pro
210 215 220

Pro Arg Ala Thr Ser Ala Leu Leu Arg Val Thr Val Leu Asp Thr Asn
225 230 235 240

Asp Asn Ala Pro Lys Phe Glu Arg Pro Ser Tyr Glu Ala Glu Leu Ser
245 250 255

Glu Asn Ser Pro Ile Gly His Ser Val Ile Gln Val Lys Ala Asn Asp
260 265 270

Ser Asp Gln Gly Ala Asn Ala Glu Ile Glu Tyr Thr Phe His Gln Ala
275 280 285

Pro SJn Val Val Arg Arg Leu Leu Arg Leu Asp Arg Asn Thr Gly Leu
290 295 300

Ile Thr Val Gln Gly Pro Val Asp Arg Glu Asp Leu Ser Thr Leu Arg
305 310 315 320

Phe Ser Val Leu Ala Lys Asp Arg Gly Thr Asn Pro Lys Ser Ala Arg
325 330 335

Ala Gln Val Val Val Thr Val Lys Asp Met Asn Asp Asn Ala Pro Thr
340 345 350

Ile Glu Ile Arg Gly Ile Gly Leu Val Thr His Gln Asp Gly Met Ala
355 360 365

Asn Ile Ser Glu Asp Val Ala Glu Glu Thr Ala Val Ala Leu Val Gln
370 375 380

Val Ser Asp Arg Asp Glu Gly Glu Asn Ala Ala Val Thr Cys Val Val
385 390 395 400

Ala Gly Asp Val Pro Phe Gln Leu Arg Gln Ala Ser Glu Thr Gly Ser
405 410 415

Asp Ser Lys Lys Lys Tyr Phe Leu Gln Thr Thr Thr Pro Leu Asp Tyr
420 425 430

Glu Lys Val Lys Asp Tyr Thr Ile Glu Ile Val Ala Val Asp Ser Gly
435 440 445




Asn Pro Pro Leu Ser Ser Thr Asn Ser Leu Lys Val Gln Val Val Asn
450 455 460

Val Asn Asp Asn Ala Pro Val Phe Thr Gln Ser Val Thr Glu Val Ala
465 470 475 480

Phe Pro Glu Asn Asn Lys Pro Gly Glu Val Ile Ala Glu Ile Thr Ala
485 490 495

Ser Asp Ala Asp Ser Gly Ser Asn Ala Glu Leu Val Tyr Ser Leu Glu
500 505 510

Pro Glu Pro Ala Ala Lys Gly Leu Phe Thr Ile Ser Pro Glu Thr Gly
515 520 525

Glu Ile Gln Val Lys Thr Ser Leu Asp Arg Glu Gln Arg Glu Ser Tyr
530 535 540

Glu Leu Lys Val Val Ala Ala Asp Arg Gly Ser Pro Ser Leu Gln Gly
545 550 555 560

Thr Ala Thr Val Leu Val Asn Val Leu Asp Cys Asn Asp Asn Asp Pro
565 570 575

Lys Phe Met Leu Ser Gly Tyr Asn Phe Ser Val Met Glu Asn Met Pro
580 585 590

Ala Leu Ser Pro Val Gly Met Val Thr Val Ile Asp Gly Asp Lys Gly
595 600 605

Glu Asn Ala Gln Val Gln Leu Ser Val Glu Gln Asp Asn Gly Asp Phe
610 615 620

Val Ile Gln Asn Gly Thr Gly Thr Ile Leu Ser Ser Leu Ser Phe Asp
625 630 635 640

Arg Glu Gln Gln Ser Thr Tyr Thr Phe Gln Leu Lys Ala Val Asp Gly
645 650 655

Gly Val Pro Pro Arg Ser Ala Tyr Val Gly Val Thr Ile Asn Val Leu
660 665 670

Asp Glu Asn Asp Asn Ala Pro Tyr Ile Thr Ala Pro Ser Asn Thr Ser
675 680 685

His Lys Leu Leu Thr Pro Gln Thr Arg Leu Gly Glu Thr Val Ser Gln
690 695 700

Val Ala Ala Glu Asp Phe Asp Ser Gly Val Asn Ala Glu Leu Ile Tyr
705 710 715 720

Ser Ile Ala Gly Gly Asn Pro Tyr Gly Leu Phe Gln Ile Gly Ser His
725 730 735

Ser Gly Ala Ile Thr Leu Glu Lys Glu Ile Glu Arg Arg His His Gly
740 745 750

Leu His Arg Leu Val Val Lys Val Ser Asp Arg Gly Lys Pro Pro Arg
755 760 765

Tyr Gly Thr Ala Leu Val His Leu Tyr Val Asn Glu Thr Leu Ala Asn
770 775 780

Arg Thr Leu Leu Glu Thr Leu Leu Gly His Ser Leu Asp Thr Pro Leu
785 790 795 800

Asp Ile Asp Ile Ala Gly Asp Pro Glu Tyr Glu Arg Ser Lys Gln Arg
805 810 815

Gly Asn Ile Leu Phe Gly Val Val Ala Gly Val Val Ala Val Ala Leu
820 825 830

Leu Ile Ala Leu Ala Val Leu Val Arg Tyr Cys Arg Gln Arg Glu Ala
835 840 845

Lys Ser Gly Tyr Gln Ala Gly Lys Lys Glu Thr Lys Asp Leu Tyr Ala
850 855 860

Pro Lys Pro Ser Gly Lys Ala Ser Lys Gly Asn Lys Ser Lys Gly Lys
865 870 875 880

Lys Ser Lys Ser Pro Lys Pro Val Lys Pro Val Glu Asp Glu Asp Glu
885 890 895

Ala Gly Leu Gln Lys Ser Leu Lys Phe Asn Leu Met Ser Asp Ala Pro
900 905 910

Gly Asp Ser Pro Arg Ile His Leu Pro Leu Asn Tyr Pro Pro Gly Ser
915 920 925

Pro Asp Leu Gly Arg His Tyr Arg Ser Asn Ser Pro Leu Pro Ser Ile
930 935 940

Gln Leu Gln Pro Gln Ser Pro Ser Ala Ser Lys Lys His Gln Val Val
945 950 955 960

Gln Asp Leu Pro Pro Ala Asn Thr Phe Val Gly Thr Gly Asp Thr Thr
965 970 975

Ser Thr Gly Ser Glu Gln Tyr Ser Asp Tyr Ser Tyr Arg Thr Asn Pro
980 985 990

Pro Lys Tyr Pro Ser Lys Gln Leu Pro His Arg Arg Val Thr Phe Ser
995 1000 1005

Ala Thr Ser Gln Ala Gln Glu Leu Gln Asp Pro Ser Gln His Ser Tyr
1010 1015 1020

Tyr Asp Ser Gly Leu Glu Glu Ser Glu Thr Pro Ser Ser Lys Ser Ser
1025 1030 1035 1040

Ser Gly Pro Arg Leu Gly Pro Leu Ala Leu Pro Glu Asp His Tyr Glu
1045 1050 1055

Arg Thr Thr Pro Asp Gly Ser Ile Gly Glu Met Glu His Pro Glu Asn
1060 1065 1070

Asp Leu Arg Pro Leu Pro Asp Val Ala Met Thr Gly Thr Cys Thr Arg
1075 1080 1085




Glu Cys Ser Glu Phe Gly His Ser Asp Thr Cys Trp Met Pro Gly Gln
1090 1095 1100

Ser Ser Pro Ser Arg Arg Thr Lys Ser Ser Ala Leu Lys Leu Ser Thr
1105 1110 1115 1120

Phe Met Pro Tyr Gln Asp Arg Gly Gly Gln Glu Pro Ala Gly Ala Gly
1125 1130 1135

Ser Pro Ser Pro Pro Glu Asp Arg Asn Thr Lys Thr Ala Pro Val Arg
1140 1145 1150

Leu Leu Pro Ser Tyr Ser Ala Phe Ser His Ser Ser His Asp Ser Cys
1155 1160 1165

Lys Asp Ser Ala Thr Leu Glu Glu Ile Pro Leu Thr Gln Thr Ser Asp
1170 1175 1180

Phe Pro Pro Ala Ala Thr Pro Ala Ser Ala Gln Thr Ala Lys Arg Glu
1185 1190 1195 1200

Ile Tyr Leu



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2789 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 115..2622



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

CGAAAGCCAT GTCGGACTCG TCGCCCAGCG CCCAAGCGCT AACCCGCTGA AAGTTTCTCA 60 

GCGAAATCTC AGGGACGATC TGGACCCCGC TGAGAGGAAC TGCTTTTGAG TCAG ATG 117

Met 
1 

GTC CCA GAG GCC TGG AGC AGC GGA CTG GTA AGC ACC GGG AGG GTA GTG 165 
Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val Val 
5 10 15 

GGA CTT TTG CTT CTG CTT GGT GCC TTG AAC AAG GCT TCC ACG GTC ATT 213
Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val Ile
20 25 30

CAC TAT GAG ATC CCG GAG GAA AGA GAG AAG GGT TTC GCT CTG GCC AAC 261
His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly Asn
35 40 45




GTG CTC CCC AAC CTT GGT TTG GAT CTC GGT AGC CTC TCA GCC CGC AGC 309
Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg Arg
50 55 60 65

TTC CCG GTG GTG TCT GGA GCT AGC CGA ACA TTC TTT GAG GTG AAC CGC 357
Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn Arg
70 75 80

... GAG ATG TTT GTG AAC GAC CGT CTG GAT CGA GAG GAG CTG 405
Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu Leu
85 90 95

... ACA CTG CCC TCT TCC ACT GTA ACT CTG GAG TTG GTA GTG GAG 453
Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val Glu
100 105 110

AAC CCG CTG GAG CTG TTC AGC GTG GAA CTG GTG ATC CAG GAC ATC AAC 501
Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile Asn
115 120 125

GAC AAC AAT CCT GCT TTC CCT ACC CAG GAA ATG AAA TTC CAG ATT AGC 549
Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile Ser
130 135 140 145

GAG GCC GTG GCT CCG GGG ACG CGC TTT CCG CTC GAG AGC CCG CAC GAT 597
Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His Asp
150 155 160

CCC GAT CTG GGA AGC AAC TCT TTA CAA ACC TAT GAG CTG AGC CGA AAT 645
Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg Asn
165 170 175

GAA TAC TTT CCC CTT CCC GTG CAG ACC CCG CAC CAC ACC ACC AAC TAC 693
Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys Tyr
180 185 190

GCG GAG CTG GTG TTG GAG CGC GCC CTC GAC CGA GAA CGG GAG CCT ACT 741
Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro Ser
195 200 205

CTC CAG TTA GTG CTC ACG GCG TTG GAC CCA CGC ACC CCA GCT CTC TCC 789
Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu Ser
210 215 220 225

... GTG ... GAC GCG ... ... AAT CCC 837
Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn Ala
230 235 240

... CGG GCG CGC GTT CCT ... 885
Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly Cys
245 250 255

... GTA CAA GTC CTT CCA ACG GAT CTG GAT GAA 933
Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp Glu
260 265 270

... ATT TAC TCC TTC GGC AGC CAC AAC CCC CCC 981
Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg Ala
275 280 285


GGC GTG CGG CAA CTA TTC GCC TTA GAC CTT GTA ACC GGG ATG CTG ACA 1029
Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu Thr
290 295 300 305

ATC AAG GGT CGG CTG GAC TTC GAG GAC ACC AAA CTC CAT GAG ATT TAC 1077
Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile Tyr
310 315 320

ATC CAG GCC AAA GAC AAG GGC GCC AAT CCC GAA GGA CCA CAT TGC AAA 1125
Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys
325 330 335

GTG TTG GTG GAG GTT GTG GAT GTG AAT GAC AAC GCC CCG GAG ATC ACA 1173
Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile Thr
340 345 350

GTC ACC TCC GTG TAC AGC CCA GTA CCC GAG GAT GCC TCT GGG ACT GTC 1221
Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr Val
355 360 365

ATC GCT TTG CTC AGT GTG ACT GAC CTG GAT GCT GGC GAG AAC GGG CTG 1269
Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly Leu
370 375 380 385

GTG ACC TGC GAA GTT CCA CCG GGT CTC CCT TTC AGC CTT ACT TCT TCC 1317
Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser Ser
390 395 400

CTC AAG AAT TAC TTC ACT TTG AAA ACC AGT GCA GAC CTG GAT CGG GAG 1365
Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg Glu
405 410 415

ACT GTG CCA GAA TAC AAC CTC AGC ATC ACC GCC CGA GAC GCC GGA ACC 1413
Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly Thr
420 425 430

CCT TCC CTC TCA GCC CTT ACA ATA GTG CGT GTT CAA GTC TCC GAC ATC 1461
Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp Ile
435 440 445

AAT GAC AAC CCT CCA CAA TCT TCT CAA TCT TCC TAC GAC GTT TAC ATT 1509
Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr Ile
450 455 460 465

GAA GAA AAC AAC CTC CCC GGG GCT CCA ATA CTA AAC CTA AGT GTC TGG 1557
Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val Trp
470 475 480

GAC CCC GAC GCC CCG CAG AAT GCT CGG CTT TCT TTC TTT CTC TTG GAG 1605
Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu Glu
485 490 495

CAA GGA GCT GAA ACC GGG CTA GTG GGT CGC TAT TTC ACA ATA AAT CGT 1653
Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn Arg
500 505 510

GAC AAT GGC ATA GTG TCA TCC TTA GTG CCC CTA GAC TAT CAG GAT CGG 1701
Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp Arg
515 520 525




CCG GAA TTT GAA TTA ACA GCT CAT ATC AGC GAT GCG GGC ACC CCG GTC 1749
Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro Val
530 535 540 545

CTA GCC ACC AAC ATC ACC GTG AAC ATA TTT GTC ACT GAT CGC AAT CAC 1797
Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn Asp
550 555 560

AAT GCC CCC CAG GTC CTA TAT CCT CGG CCA GGT GGG AGC TCG GTG GAG 1845
Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val Glu
565 570 575

ATG CTG CCT CGA GGT ACC TCA GCT GGC CAC CTA GTG TCA CGG GTG CTA 1893
Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val Val
580 585 590

GGC TGG GAC GCG GAT GCA GGG CAC AAT GCC TGG CTC TCC TAC AGT CTC 1941
Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser Leu
595 600 605

TTT GGA TCC CCT AAC CAG AGC CTT TTT GCC ATA GGG CTG CAC ACT GGT 1989
Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr Gly
610 615 620 625

CAA ATC AGT ACT GCC CGT CCA GTC CAA GAC ACA GAT TCA CCC AGG CAG 2037
Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg Gln
630 635 640

ACT CTC ACT GTC TTG ATC AAA GAC AAT GGG CAG CCT TCG CTC TCC ACC 2085
Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser Thr
645 650 655

ACT GCT ACC CTC ACT GTG TCA GTA ACC CAG GAC TCT CCT GAA GCC CGA 2133
Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala Arg
660 665 670

GCC GAG TTC CCC TCT GGC TCT GCC CCC CGG GAG CAG AAA AAA AAT CTC 2181
Ala Glu Phe Pro Ser Gly Ser Ala Pro Arg Glu Gln Lys Lys Asn Leu
675 680 685

ACC TTT TAT CTA CTT CTT TCT CTA ATC CTC GTT TCT GTG GGC TTC GTG 2229
Thr Phe Tyr Leu Leu Leu Ser Leu Ile Leu Val Ser Val Gly Phe Val
690 695 700 705

... GTA ATC ATA TTC ... GTT TAC ... TCG AAC CAG 2277
Val Thr Val Phe Gly Val Ile Ile Phe Lys Val Tyr Lys Trp Lys Gln
710 715 720

TCT AGA GAC CTA TAC CGA GCC CCG GTC AGC TCA CTG TAC CGA ACA CCA 2325
Ser Arg Asp Leu Tyr Arg Ala Pro Val Ser Ser Leu Tyr Arg Thr Pro
725 730 735

GGG CCC TCC TTG CAC GCG GAC GCC GTG CGG GGA GGC CTC ATG TCG CCG 2373
Gly Pro Ser Leu His Ala Asp Ala Val Arg Gly Gly Leu Met Ser Pro
740 745 750

CAC CTT TAC CAT CAG GTG TAT CTC ACC ACG GAC TCC CGC CGC AGC GAC 2421
His Leu Tyr His Gln Val Tyr Leu Thr Thr Asp Ser Arg Arg Ser Asp
755 760 765




CCG CTG CTG AAG AAA CCT GGT GCA GCC ACT CCA CTG GCC AGC CGC CAG 2469
Pro Leu Leu Lys Lys Pro Gly Ala Ala Ser Pro Leu Ala Ser Arg Gln
770 775 780 785

AAC ACG CTG CGG AGC TGT GAT CCG GTG TTC TAT AGG CAG GTG TTG GGT 2517
Asn Thr Leu Arg Ser Cys Asp Pro Val Phe Tyr Arg Gln Val Leu Gly
790 795 800

GCA GAG AGC GCC CCT CCC GGA CAG GTA AGG TTT AGC AAG TCA TGC TTG 2565
Ala Glu Ser Ala Pro Pro Gly Gln Val Arg Phe Ser Lys Ser Cys Leu
805 810 815

ACC CTG TTA GTG CCT TTT TAT TCC TAC ATC ATA TTG AGA AGG CTG GAG 2613
Thr Leu Leu Val Pro Phe Tyr Ser Tyr Ile Ile Leu Arg Arg Leu Glu
820 825 830

CTG TTT TTT TAGTGATGAA GATGTTTTCC TGGTGATGCA TTCACACTTT 2662
Leu Phe Phe
835

CAACTGGCTC TTCCTAGATC AAAGTTAGTG CCTTTGTGAG ATGGTGGCCT GCCAGAGTGT 2722
GGTTTGTGGT CCCATTTCAG GGGGAAGATA CTTGACTCAT CTGTGGACCT AATTCACATC 2782
CTCAGCG 2789



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 836 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105: 

Met Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val
1 5 10 15

Val Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val
20 25 30

Ile His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly
35 40 45

Asn Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg
50 55 60

Arg Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn
65 70 75 80

Arg Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu
85 90 95

Leu Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val
100 105 110

Glu Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile
115 120 125




Asn Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile
130 135 140

Ser Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His
145 150 155 160

Asp Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg
165 170 175

Asn Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys
180 185 190

Tyr Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro
195 200 205

Ser Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu
210 215 220

Ser Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn
225 230 235 240

Ala Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly
245 250 255

Cys Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp
260 265 270

Glu Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg
275 280 285

Ala Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu
290 295 300

Thr Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile
305 310 315 320

Tyr Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys
325 330 335

Lys Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile
340 345 350

Thr Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr
355 360 365

Val Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly
370 375 380

Leu Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser
385 390 395 400

Ser Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg
405 410 415

Glu Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly
420 425 430

Thr Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp
435 440 445




Ile Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr
450 455 460

Ile Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val
465 470 475 480

Trp Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu
485 490 495

Glu Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn
500 505 510

Arg Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp
515 520 525

Arg Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro
530 535 540

Val Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn
545 550 555 560

Asp Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val
565 570 575

Glu Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val
580 585 590

Val Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser
595 600 605

Leu Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr
610 615 620

Gly Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg
625 630 635 640

Gln Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser
645 650 655

Thr Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala
660 665 670

Arg Ala Glu Phe Pro Ser Gly Ser Ala Pro Arg Glu Gln Lys Lys Asn
675 680 685

Leu Thr Phe Tyr Leu Leu Leu Ser Leu Ile Leu Val Ser Val Gly Phe
690 695 700

Val Val Thr Val Phe Gly Val Ile Ile Phe Lys Val Tyr Lys Trp Lys
705 710 715 720

Gln Ser Arg Asp Leu Tyr Arg Ala Pro Val Ser Ser Leu Tyr Arg Thr
725 730 735

Pro Gly Pro Ser Leu His Ala Asp Ala Val Arg Gly Gly Leu Met Ser
740 745 750

Pro His Leu Tyr His Gln Val Tyr Leu Thr Thr Asp Ser Arg Arg Ser
755 760 765




Asp Pro Leu Leu Lys Lys Pro Gly Ala Ala Ser Pro Leu Ala Ser Arg
770 775 780

Gln Asn Thr Leu Arg Ser Cys Asp Pro Val Phe Tyr Arg Gln Val Leu
785 790 795 800

Gly Ala Glu Ser Ala Pro Pro Gly Gln Val Arg Phe Ser Lys Ser Cys
805 810 815

Leu Thr Leu Leu Val Pro Phe Tyr Ser Tyr Ile Ile Leu Arg Arg Leu
820 825 830

Glu Leu Phe Phe
835

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2751 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 115..2160



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
CGAAAGCCAT GTCGGACTCG TCGCCCAGCG CCCAAGCGCT AACCCGCTGA AAGTTTCTCA 60

GCGAAATCTC AGGGACGATC TGGACCCCGC TGAGAGGAAC TGCTTTTGAG TGAG ATG 117 

Met 
1 

GTC CCA GAG GCC TGG AGG AGC GGA CTG GTA AGC ACC GGG AGG GTA GTG 165 
Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val Val 
5 10 15 

GGA GTT TTG CTT CTG CTT GGT GCC TTG AAC AAG GCT TCC ACG GTC ATT 213
Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val Ile
20 25 30

CAC TAT GAG ATC CCC GAG GAA AGA GAG AAG GGT TTC GCT GTC GGC AAC 261
His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly Asn
35 40 45

... CAT CTC ... ACC CTC ... CCC CCC AGG 309
Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg Arg
50 55 60 65

... 357
Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn Arg
70 75 80

GAG ACC GGA GAG ATG TTT GTG AAC GAC CGT CTG GAT CGA GAG GAG CTG 405
Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu Leu
85 90 95

TGT GGG ACA CTG CCC TCT TGC ACT GTA ACT CTG GAG TTG GTA GTG GAG 453
Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val Glu
100 105 110

AAC CCG CTG GAG CTG TTC AGC GTG GAA GTG GTG ATC CAG GAC ATC AAC 501
Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile Asn
115 120 125

GAC AAC AAT CCT GCT TTC CCT ACC CAG GAA ATG AAA TTG GAG ATT AGC 549
Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile Ser
130 135 140 145

GAG GCC GTG GCT CCG GGG ACG CGC TTT CCG CTC GAG AGC GCG CAC GAT 597
Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His Asp
150 155 160

CCC GAT CTG GGA AGC AAC TCT TTA CAA ACC TAT GAG CTG AGC CGA AAT 645
Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg Asn
165 170 175

GAA TAC TTT GCG CTT CGC GTG CAG ACG CGG GAG GAC AGC ACC AAG TAC 693
Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys Tyr
180 185 190

GCG GAG CTG GTG TTG GAG CGC GCC CTG GAC CGA GAA CGG GAG CCT AGT 741
Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro Ser
195 200 205

CTC CAG TTA GTG CTG ACG GCG TTG GAC GGA GGG ACC CCA GCT CTC TCC 789
Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu Ser
210 215 220 225

GCC AGC CTG CCT ATT CAC ATC AAG GTG CTG GAC GCG AAT GAC AAT GCG 837
Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn Ala
230 235 240

CCT GTC TTC AAC CAG TCC TTG TAC CGG GCG CGC GTT CCT GGA GGA TGC 885
Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly Cys
245 250 255

ACC TCC GGC ACG CGC GTG GTA CAA GTC CTT GCA ACG GAT CTG GAT GAA 933
Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp Glu
260 265 270

GGC CCC AAC GGT GAA ATT ATT TAC TCC TTC GGC AGC CAC AAC CGC GCC 981
Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg Ala
275 280 285

GGC GTG CGG CAA CTA TTC GCC TTA GAC CTT GTA ACC GGG ATG CTG ACA 1029
Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu Thr
290 295 300 305

ATC AAG GGT CGG CTG GAC TTC GAG GAC ACC AAA CTC CAT GAG ATT TAC 1077
Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile Tyr
310 315 320



... GCC ... CCC GAA CCA CCA CAT TGC AAA 1125
Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys Lys
325 330 335

GTG TTG GTG GAG GTT GTG GAT GTG AAT GAC AAC GCC CCG GAG ATC ACA 1173
Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile Thr
340 345 350

CTC ACC TCC GTG TAC AGC CCA GTA CCC GAG GAT GCC TCT GGG ACT CTC 1221
Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr Val
355 360 365

ATC GCT TTG CTC ACT GTG ACT GAC CTC GAT GCT GCC GAG AAC GCC CTC 1269
Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly Leu
370 375 380 385

... GAA GTT CCA CCG ... CTC CCT TTC ACC CTT ACT TCT TCC 1317
Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser Ser
390 395 400

CTC AAG AAT TAC TTC ACT TTG AAA ACC ACT CCA CAC CTC GAT CCC GAG 1365
Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg Glu
405 410 415

... CTC AGC ... ACC ... CGA GAC GCC ... ACC 1413
Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly Thr
420 425 430

CCT TCC CTC TCA GCC CTT ACA ATA GTG CGT CTT CAA GTG TCC GAC ATC 1461
Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp Ile
435 440 445

AAT GAC AAC CCT CCA CAA TCT TCT CAA TCT TCC TAC GAC CTT TAC ATT 1509
Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr Ile
450 455 460 465

... 1557
Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val Trp
470 475 480

GAC CCC GAC GCC CCC CAC AAT GCT CGC CTT TCT TTC TTT CTC TTC ... 1605
Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu Glu
485 490 495

... 1653
Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn Arg
500 505 510

... 1701
Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp Arg
515 520 525

... TTA ACA GCT CAT ATC AGC GAT GCC CCC ACC CCC CTC 1749
Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro Val
530 535 540 545

... 1797
Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn Asp
550 555 560




AAT GCC CCC CAG GTC CTA TAT CCT CGG CCA GCT GGG AGC TCG GTG GAG 1845
Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val Glu
565 570 575

ATG CTG CCT CGA GGT ACC TCA GCT GGC CAC CTA GTG TCA CGG GTG GTA 1893
Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val Val
580 585 590

GGC TGG GAC GCG GAT GCA GGG CAC AAT GCC TGG CTC TCC TAC ACT CTC 1941
Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser Leu
595 600 605

TTT GGA TCC CCT AAC CAG AGC CTT TTT GCC ATA GGG CTG CAC ACT GGT 1989
Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr Gly
610 615 620 625

CAA ATC AGT ACT GCC CGT CCA GTC CAA CAC ACA GAT TCA CCC AGG CAG 2037
Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg Gln
630 635 640

ACT CTC ACT GTC TTG ATC AAA GAC AAT GGG GAG CCT TCG CTC TCC ACC 2085
Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser Thr
645 650 655

ACT GCT ACC CTC ACT GTG TCA GTA ACC GAG GAC TCT CCT GAA GCC CGA 2133
Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala Arg
660 665 670

GCC GAG TTC CCC TCT GGC TCT GCC AGT TAAACCTTCT TTAATTATGG 2180
Ala Glu Phe Pro Ser Gly Ser Ala Ser
675 680

ATTAGCCATT AACATTTTTG AAACCTGGAC CATTTAACCT CGGCCTACCC CCTCCAACTG 2240

TCCTGGTGAT GAGTTCATTA GCTAAGTTAA ATTAATTGAA CTTTGATCTA AACCAAAACA 2300

AATCAGGAAA ATAAAGCTGT AAAGGAACTT ATCAAGCATT CCAAAACCAA CTAGAAATTA 2360

CTTGAAGTTT CGAGTGAGCA TTGCCTGTGC CAGTATTCTT CATTATAGGA TTATAAACTC 2420

GTTTTTTTCC CAAAGCGCAT GTCTACGCCA GGCAGAGGAG TAATTATTCA GCCAATTTCA 2480

TGGATGTAAC GATGGATATA AATAATTGAT AGCACCTAGA GGCTTCCAGT TTGGGTGGAA 2540

GGCTAAAAGT AGAGGGGAAC TCACTCACTT GAGAAATGAT ATTTAAGTGA ATAAATAGTT 2600

CTCTTCTATG AAACTATTAC TATTTACTTC TCTCGAAAAC TTAAGTGTAT TAATGATTAG 2660

AACATCAAAT CCTAAGTAAA GAAATGACAT TTTAAATATA AAAAGCCAAA CTTTAAATAA 2720

ATCATAGAGA CCTCAGACAT AATATAGCAA A 2751



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 682 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Met Val Pro Glu Ala Trp Arg Ser Gly Leu Val Ser Thr Gly Arg Val
1 5 10 15

Val Gly Val Leu Leu Leu Leu Gly Ala Leu Asn Lys Ala Ser Thr Val
20 25 30

Ile His Tyr Glu Ile Pro Glu Glu Arg Glu Lys Gly Phe Ala Val Gly
35 40 45

Asn Val Val Ala Asn Leu Gly Leu Asp Leu Gly Ser Leu Ser Ala Arg
50 55 60

Arg Phe Pro Val Val Ser Gly Ala Ser Arg Arg Phe Phe Glu Val Asn
65 70 75 80

Arg Glu Thr Gly Glu Met Phe Val Asn Asp Arg Leu Asp Arg Glu Glu
85 90 95

Leu Cys Gly Thr Leu Pro Ser Cys Thr Val Thr Leu Glu Leu Val Val
100 105 110

Glu Asn Pro Leu Glu Leu Phe Ser Val Glu Val Val Ile Gln Asp Ile
115 120 125

Asn Asp Asn Asn Pro Ala Phe Pro Thr Gln Glu Met Lys Leu Glu Ile
130 135 140

Ser Glu Ala Val Ala Pro Gly Thr Arg Phe Pro Leu Glu Ser Ala His
145 150 155 160

Asp Pro Asp Leu Gly Ser Asn Ser Leu Gln Thr Tyr Glu Leu Ser Arg
165 170 175

Asn Glu Tyr Phe Ala Leu Arg Val Gln Thr Arg Glu Asp Ser Thr Lys
180 185 190

Tyr Ala Glu Leu Val Leu Glu Arg Ala Leu Asp Arg Glu Arg Glu Pro
195 200 205

Ser Leu Gln Leu Val Leu Thr Ala Leu Asp Gly Gly Thr Pro Ala Leu
210 215 220

Ser Ala Ser Leu Pro Ile His Ile Lys Val Leu Asp Ala Asn Asp Asn
225 230 235 240

Ala Pro Val Phe Asn Gln Ser Leu Tyr Arg Ala Arg Val Pro Gly Gly
245 250 255

Cys Thr Ser Gly Thr Arg Val Val Gln Val Leu Ala Thr Asp Leu Asp
260 265 270

Glu Gly Pro Asn Gly Glu Ile Ile Tyr Ser Phe Gly Ser His Asn Arg
275 280 285

Ala Gly Val Arg Gln Leu Phe Ala Leu Asp Leu Val Thr Gly Met Leu
290 295 300

Thr Ile Lys Gly Arg Leu Asp Phe Glu Asp Thr Lys Leu His Glu Ile
305 310 315 320

Tyr Ile Gln Ala Lys Asp Lys Gly Ala Asn Pro Glu Gly Ala His Cys
325 330 335

Lys Val Leu Val Glu Val Val Asp Val Asn Asp Asn Ala Pro Glu Ile
340 345 350

Thr Val Thr Ser Val Tyr Ser Pro Val Pro Glu Asp Ala Ser Gly Thr
355 360 365

Val Ile Ala Leu Leu Ser Val Thr Asp Leu Asp Ala Gly Glu Asn Gly
370 375 380

Leu Val Thr Cys Glu Val Pro Pro Gly Leu Pro Phe Ser Leu Thr Ser
385 390 395 400

Ser Leu Lys Asn Tyr Phe Thr Leu Lys Thr Ser Ala Asp Leu Asp Arg
405 410 415

Glu Thr Val Pro Glu Tyr Asn Leu Ser Ile Thr Ala Arg Asp Ala Gly
420 425 430

Thr Pro Ser Leu Ser Ala Leu Thr Ile Val Arg Val Gln Val Ser Asp
435 440 445

Ile Asn Asp Asn Pro Pro Gln Ser Ser Gln Ser Ser Tyr Asp Val Tyr
450 455 460

Ile Glu Glu Asn Asn Leu Pro Gly Ala Pro Ile Leu Asn Leu Ser Val
465 470 475 480

Trp Asp Pro Asp Ala Pro Gln Asn Ala Arg Leu Ser Phe Phe Leu Leu
485 490 495

Glu Gln Gly Ala Glu Thr Gly Leu Val Gly Arg Tyr Phe Thr Ile Asn
500 505 510

Arg Asp Asn Gly Ile Val Ser Ser Leu Val Pro Leu Asp Tyr Glu Asp
515 520 525

Arg Arg Glu Phe Glu Leu Thr Ala His Ile Ser Asp Gly Gly Thr Pro
530 535 540

Val Leu Ala Thr Asn Ile Ser Val Asn Ile Phe Val Thr Asp Arg Asn
545 550 555 560

Asp Asn Ala Pro Gln Val Leu Tyr Pro Arg Pro Gly Gly Ser Ser Val
565 570 575

Glu Met Leu Pro Arg Gly Thr Ser Ala Gly His Leu Val Ser Arg Val
580 585 590

Val Gly Trp Asp Ala Asp Ala Gly His Asn Ala Trp Leu Ser Tyr Ser
595 600 605

Leu Phe Gly Ser Pro Asn Gln Ser Leu Phe Ala Ile Gly Leu His Thr
610 615 620

Gly Gln Ile Ser Thr Ala Arg Pro Val Gln Asp Thr Asp Ser Pro Arg
625 630 635 640

Gln Thr Leu Thr Val Leu Ile Lys Asp Asn Gly Glu Pro Ser Leu Ser
645 650 655

Thr Thr Ala Thr Leu Thr Val Ser Val Thr Glu Asp Ser Pro Glu Ala
660 665 670

Arg Ala Glu Phe Pro Ser Gly Ser Ala Ser
675 680



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2831 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

GAATTCGGCA CGAGGCTGAA CTGAGGGTGA CGGACATAAA CGACTATTCT CCAGTGTTCA 60

GTGAAAGAGA AATGATACTG AGGATACCAG AAAACAGTGC TCGGGGAAAT ACATTCCCTT 120

TAAACAATGC TCTGGACTCA GACGTAGATA TCAACAATAT CCAGACCTAT AGGCTCAGCT 180

CAAACTCTCA TTTCCTGGTT GTAACCCCCA ACCGCAGTGA TGGCAGGAAG TACCCACAGC 240

TGGTGCTGGA GAAAGAACTG GATCGAGAGG AGGAACCTGA GCTGAGGTTA ACGCTGACAG 300

CTTTGGATGG TGGCTCTCCT CCCCGGTCTG GGACGACACA GGTCCTCATT GAAGTAGTGG 360

ACACCAACGA TAATGCACCC GAGTTTCAGC AGCCAACATA CCAAGTGCAA ACTCCCGAGA 420

ACAGTCCCAC CGGCTCTCTG GTACTCACAG TCTCAGCCAA TGACTTAGAC AGTGGAGACT 480

ATGGGAAAGT CTTGTACGCA CTTTCGCAAC CCTCAGAAGA TATTAGCAAA ACATTCGAGG 540

TAAACCCTGT AACCGGGGAA ATTCGCCTAC GAAAAGAGGT GAATTTTGAA ACTATTCCTT 600 

CGTATGAAGT GGTTATCAAG GGGACGGACG GGGGAGGTCT CTCAGGAAAA TGCACTCTGT 660 

TACTGCAGGT GGTGGACGTG AATGACAATG CCCCAGAAGT GATGCTATCT GCGCTAACCA 720 

ACCCAGTCCC AGAAAATTCC CCCGATGAGG TAGTGGCTGT TTTCAGTGTT AGAGATCCTG 780 

ACTCTGGGAA CAACGGAAAA CTGATTGCAT CCATCGAGGA AGACCTGCCC TTTCTTCTAA 840 

AATCTTCAGG AAAGAACTTT TACACTTTAG TAACCAAGGG AGCACTTGAC AGGGAAGAAA 900 

GAGAGCAATT GAACATCACC ATCACAGTCA CTGACCTGGG CATACCCAGG CTCACCACCC 960 

AACACACCAT AACAGTGCAC GTGGCAGACA TCAACCACAA TGCCCCCTCC TTCACCCAAA 1020 

CCTCCTACAC CATGTTTGTC CGCGAGAACA ACAGCCCCGC CCTGCACATA GCCACCATCA 1080 

GCGCCACAGA CTCAGACTCA GGATCCAATG CCCACATCAC CTACTCGCTG CTACCGCCCC 1140

AAGACCCACA GCTGGCCCTC GACTCGCTCA TCTCCATCAA TGTAGACAAC GGGCAGCTGT 1200

TCGCGCTCAG GGCGCTAGAC TATGAGGCTC TGCAGGGCTT CGAGTTCCAT GTGGGCGCCA 1260

CAGACCAAGG CTCGCCCGCG CTCAGCAGCC AGGCTCTGGT GCACGTGGTG GTGTTGGACG 1320

ACAATGACAA TGCGCCCTTC GTGCTCTACC CGCTGCAAAA CGCCTCTGCA CCCTTCACTG 1380

AGCTGCTGCC CAGGGCGGCA GAGCCTGGAT ACCTGGTTAC CAAGGTGGTA GCTGTGGACC 1440

GCGACTCTGG CCAGAATGCC TGGCTGTCAT TCCAGCTGCT GAAGGCCACG GAGCCCGGGC 1500

TGTTCAACGT ATGGGCGCAC AATGGCGAGG TACGCACCTC CAGGCTGCTG AGCGAGCGCG 1560

ACGCACCCAA GCACAAGCTG CTGCTGTTGG TCAAGGACAA TGGAGATCCT CCACGCTCTG 1620

CCAGTGTTAC TCTGCACGTG CTAGTGGTGG ATGCCTTCTC TCAGCCCTAC CTGCCTCTGC 1680

CAGAGGTGGC GCACGACCCT GCACAAGAAG AAGATGCGCT AACACTCTAC CTGGTCATAG 1740

CTTTGGCATC TGTGTCTTCT CTCTTCCTCT TGTCTGTGCT GCTGTTCGTG GGGGTGAGGC 1800

TCTGCAGGAG GGCCAGGGCA GCCTCTCTGA GTGCCTATTC TGTGCCTGAA GGCCACTTTC 1860

CTGGCCAGCT GGTGGATGTC AGAGGTATGG GGACCCTGTC CCAGAGCTAC CAGTATGATG 1920

TATGTCTGAT GGGGGATTCT TCTGGGACCA GCGAATTTAA CTTCTTAAAG CCAGTTCTGC 1980

CTAGCTCTCT GCACCAGTGC TCTGGGAAAG AAATAGAGGA AAATTCCACA CTCCAGAATA 2040

GTTTTGGGTT TCATCATTAA TAGAAAACTA CTTTACAGAT ATTTAATTCC AAATATCATC 2100

TTGTTGATTA ACTAAAGTCT GTTCACATGT AGCTAGCTAG CAACGATTTT AATGTTCACT 2160

TTACCCATCT TTTTTCAGGG TCATGTCTAA AGCTACAAGT TTGNCTTTAC TTATACTTGT 2220

CGCACAGAAT NNNNNNNNNN TGGTGTATAA GTCACAGTCA TGGGATACTG GCACAAGATG 2280

GCAGCTTGAT TGCTCAGTTA TGGCTGCAAA GGGGNGCTTG AGTTTAGGGA ATGTGTTAGA 2340

GCTGGAATAA GTTTTCTGAG AAATGTGTAA GACAAATTTC TTTTGCACAT TCCCTGTGTT 2400

CCTGTACCCC TGTTTCCAGA ACTACGAAAT GTGTCATCAG AAGGCATGCT CACATTTTCC 2460

CCTTTGTTTG CGTGACCCGG GTGCCAGAAA TTAAATAAAA TTAGCATGGA GTTCAATGCA 2520

GCATTAAAAC AAAGTTACTT CTACAAACCT TTTATTCGAC GGTTAAAATT GTAACTTCCC 2580

CACCCATGAG GCTGGCTGTA AGAACCAGTA TGAATGGGTG TCTATCGCAA CCTTATTTTC 2640

AAAAATCAAA CAAAAGGAGA AATGAGAGAC CAAACAACAC GCTACAGGAA AGATTTCATA 2700

AGGATGTATG TATGGACACA AAAACTGGGA TACAGACATT TTAAATCTGT TGGTACCACA 2760

TGGTGGCGCT GCAGGCTAAA GAAATGCAAG GGAAATTAAA AAGAGGCTGA GCTAGAAGTC 2820

AAAAAAAAAA A 2831

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3353 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 763..3123



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

GTATTTTTCC ACAGTTTAAA ATTTTCATAA AATCATAACT CTCTCACTTT ATGTAGAAAG 60 

GATACCACAC TGGAATTAAC GTGTAGCTTT TTCTTGATGT AATCCAACCA ATGGGAGCAC 120 

AATTCTGGTA CATAGGCTGT CTAGAATTTG AAAGAAATTA AAGAATTCAT TTTCTTTTGC 180 

TGATAAATTT TTAAGAAATC ACGTGCCTTT ATCTTATTAT TATTACAAGA TGACTGATCA 240 

CTATTATGTC TTCTTTCACT TCTCAATTTC CCTCAGAACA CTACACCCAG ACTACACGCT 300 

CTGGAGGGTG GGGACCATGT CTGGGTTGTT TACTGATGTA TTTCATAATT TGGCACATAG 360 

AGACCAATAA TACTCCTTTA AATCAAGAAA TTAATAATTA CCATTGCGTG ATATTGTGAT 420 

TACATCATTT CCTCCCAATT TCCAAACTCC TAATACAATA GACAATAGAT CAATTGTAGC 480 

AATTCGTTTC GAAGCAAAGA CAACGCATGG TGCCGCTGCA GCCTAAGCCT TCAAAAAAAG 540 

GAAAAGGAAA AAGCCCATGA AATGCTACTA GCTACTTCAG ACCTCTTTCA GCCTAAGAGG 600 

AAAGCCTGTT AGCAGAGCAC GGACCAGTCT CTCCGGAGAA TGCTATTCTC CTACATTTCC 660 

GAACAGGTTA TCAACGCACA GATCGATCAC TGCCTCTGTC CCATCGCTCC CTGAAGTAGC 720

TCTGACTCCG GTTCCTTGAA AGGGGCGTCT ACAGAAGTAA AO ATG GAG CCT GCA 774
Met Glu Pro Ala
1

r?5 » GC IT « CC GAA AGG CAA CTC ATT CTC CTT CTT TTA 822
Gly Glu Arg Phe Pro Glu Gln Arg Gln Val Leu Ile Leu Leu Leu Leu
5 10 15 20

T°I G r** f, T ? ~u T CTG GCA 000 TGG GAA CCC CGT CGC TAT TCT CTC ATG 870
Leu Glu Val Thr Leu Ala Gly Trp Glu Pro Arg Arg Tyr Ser Val Met
25 30 35

rt G GAA £ CA S AG AGA 007 TCT TTT GTA GCC ** C CTG GCC AAT GAC CTA 918
Glu Glu Thr Glu Arg Gly Ser Phe Val Ala Asn Leu Ala Asn Asp Leu
40 45 50

Git fl° °- T ? GCG GAG CTA GCC GAG CGG WA CCC CGC GTA GTT TCT 966
Gly Leu Gly Val Gly Glu Leu Ala Glu Arg Gly Ala Arg Val Val Ser
55 60 65

GAG GAT AAC CAA CAA GGC TTG CAG CTT GAT CTG CAG ACC GGG GAG TTG 1014
Glu Asp Asn Glu Gln Gly Leu Gln Leu Asp Leu Gln Thr Gly Gln Leu
70 75 80

ATA TTA AAT GAG AAG CTG GAC CGC GAG AAC CTG TCT GGC CCT ACT CAC 1062
Ile Leu Asn Glu Lys Leu Asp Arg Glu Lys Leu Cys Gly Pro Thr Glu
85 90 95 100

CCC TGT ATA ATG CAT TTC CAA GTG TTA CTG AAA AAA CCT TTG GAA GTA 1110
Pro Cys Ile Met His Phe Gln Val Leu Leu Lys Lys Pro Leu Glu Val
105 110 115

TTT CGA GCT GAA CTA CTA GTG ACA GAC ATA AAC GAT CAT TCT CCT CAC 1158
Phe Arg Ala Glu Leu Leu Val Thr Asp Ile Asn Asp His Ser Pro Glu
120 125 130

TTT CCT GAA AGA GAA ATG ACC CTG AAA ATC CCA GAA ACT AGC TCC CTT 1206
Phe Pro Glu Arg Glu Met Thr Leu Lys Ile Pro Glu Thr Ser Ser Leu
135 140 145

G?S JS vll Ph T S™ f TG ^ AAA CCT CGG GAC TTG CAC CTG GGC ACC 1254
Gly Thr Val Phe Pro Leu Lys Lys Ala Arg Asp Leu Asp Val Gly Ser
150 155 160

AAT AAT GTT CAA AAC TAC AAT ATT TCT CCC AAT TCT CAT TTC CAT CTT 1302
Asn Asn Val Gln Asn Tyr Asn Ile Ser Pro Asn Ser His Phe His Val
165 170 175 180

TCC ACT CGC ACC CGA GGG CAT CGC ACG AAA TAC CCA CAG CTG GTG CTG 1350
Ser Thr Arg Thr Arg Gly Asp Gly Arg Lys Tyr Pro Glu Leu Val Leu
185 190 195

GAC ACA GAA CTG GAT CGC GAG GAG CAG GCC GAG CTC AGA TTA ACC TTG 1398
Asp Thr Glu Leu Asp Arg Glu Glu Gln Ala Glu Leu Arg Leu Thr Leu
200 205 210

J2 111 ?S Asn S3 S° S CA CCC CCA TCT 000 ACC CTC CAG ATC 1446
Thr Ala Val Asp Gly Gly Ser Pro Pro Arg Ser Gly Thr Val Gln Ile
215 220 225

CTC ATC TTG GTC TTG GAC GCC AAT GAC AAT GCC CCG GAG TTT GTG CAG 1494
Leu Ile Leu Val Leu Asp Ala Asn Asp Asn Ala Pro Glu Phe Val Gln
230 235 240

f TC S AC 2 AC GTG CAG GTC CCA GAG AAC AGC CCA CTA GGC TCC CTA 1542
Ala Leu Tyr Glu Val Gln Val Pro Glu Asn Ser Pro Val Gly Ser Leu
245 250 255 260

vll SaT fls 5? IS !fl t AT P A CAC ACT 000 ACA AAT GGA CAC 1590
Val Val Lys Val Ser Ala Arg Asp Leu Asp Thr Gly Thr Asn Gly Glu
265 270 275

He 52 Xr S^r SI ?* ^ C A ° C TCT CAG CAC ATA GAC AAA CCT TTT 1638
Ile Ser Tyr Ser Leu Tyr Tyr Ser Ser Gln Glu Ile Asp Lys Pro Phe
280 285 290

GAG CTA AGC AGC CTT TCA CGA GAA ATT CGA CTA ATT AAA AAA CTA CAT 1686
Glu Leu Ser Ser Leu Ser Gly Glu Ile Arg Leu Ile Lys Lys Leu Asp
295 300 305
TTT GAG ACA ATG TCT TCA TAT GAT CTA GAT ATA GAG GCA TCT GAT GGC 1734
Phe Glu Thr Met Ser Ser Tyr Asp Leu Asp Ile Glu Ala Ser Asp Gly
310 315 320

GGG GGA CTT TCT GGA AAA TGC TCT CTC TCT GTT AAG GTG CTG GAT CTT 1782
Gly Gly Leu Ser Gly Lys Cys Ser Val Ser Val Lys Val Leu Asp Val
325 330 335 340

AAC GAT AAC TTC CCG GAA CTA ACT ATT TCA TCA CTT ACC AGC CCT ATT 1830
Asn Asp Asn Phe Pro Glu Leu Ser Ile Ser Ser Leu Thr Ser Pro Ile
345 350 355

CCC GAG AAT TCT CCA GAG ACA GAA GTG GCC CTG TTT AGO ATT ACA GAC 1878
Pro Glu Asn Ser Pro Glu Thr Glu Val Ala Leu Phe Arg Ile Arg Asp
360 365 370

CGA GAC TCT GGA GAA AAT GGA AAA ATG ATT TGC TCA ATT CAG GAT GAT 1926
Arg Asp Ser Gly Glu Asn Gly Lys Met Ile Cys Ser Ile Gln Asp Asp
375 380 385

GTT CCT TTT AAG CTA AAA CCT TCT GTT GAG AAT TTC TAC AGG CTG GTA 1974
Val Pro Phe Lys Leu Lys Pro Ser Val Glu Asn Phe Tyr Arg Leu Val
390 395 400

?S 2tJ ??S r CTG GAC ? GA GAG ACC AGA GCC CAG TAC AAC ATC ACC 2022
Thr Glu Gly Ala Leu Asp Arg Glu Thr Arg Ala Glu Tyr Asn Ile Thr
405 410 415 420

r? C A u C A ?° ACA GAC TTG 000 ACT CCA AGG CTG AAA ACC GAG CAG AGC 2070
Ile Thr Ile Thr Asp Leu Gly Thr Pro Arg Leu Lys Thr Glu Gln Ser
425 430 435

ATA ACC GTG CTG GTG TCG GAC GTC AAT GAC AAC GCC CCC GCC TTC ACC 2118
Ile Thr Val Leu Val Ser Asp Val Asn Asp Asn Ala Pro Ala Phe Thr
440 445 450

GAA ACC l CC * AC ACC CTG TTC GTC CGC GAG AAC AAC AGC CCC GCC CTG 2166
Gln Thr Ser Tyr Thr Leu Phe Val Arg Glu Asn Asn Ser Pro Ala Leu
455 460 465

nil lit S?S A f T S 7 ? AGC GCC ACA GAC AGA CAC TCG GCC ACC AAC GCC 2214
His Ile Gly Ser Val Ser Ala Thr Asp Arg Asp Ser Gly Thr Asn Ala
470 475 480

Gin vl? T CC - 1*° l CG CTG CCG CCC GAC CCC CAC CTG CCC CTA 2262
Gln Val Thr Tyr Ser Leu Leu Pro Pro Gln Asp Pro His Leu Pro Leu
485 490 495 500

ACC TCC CTG GTC TCC ATT AAC ACC CAC AAC GGC CAC CTC TTC GCT CTC 2310
Thr Ser Leu Val Ser Ile Asn Thr Asp Asn Gly His Leu Phe Ala Leu
505 510 515

CAG TCG CTG GAC TAC GAG GCC CTG CAG GCT TTC GAG TTC CGC GTG GCC 2358
Gln Ser Leu Asp Tyr Glu Ala Leu Gln Ala Phe Glu Phe Arg Val Gly
520 525 530

Sa ? C J SSS ll C » CG GCG CTG AGC AGC GAG GCG CTG GTC CCA 2406
Ala Thr Asp Arg Gly Phe Pro Ala Leu Ser Ser Glu Ala Leu Val Arg
535 540 545

GTG CTG GTG CTG GAC GCC AAC GAC AAC TCG CCC TTC GTG CTG TAC CCG 2454
Val Leu Val Leu Asp Ala Asn Asp Asn Ser Pro Phe Val Leu Tyr Pro
550 555 560

CTG CAG AAC GGC TCC GCG CCC TGC ACC GAG CTG GTG CCC CGG GCG GCC 2502
Leu Gln Asn Gly Ser Ala Pro Cys Thr Glu Leu Val Pro Arg Ala Ala
565 570 575 580

GAG CCG GGC TAC CTG GTG ACC AAG GTG GTG GCG GTG GAC GGC GAC TCG 2550
Glu Pro Gly Tyr Leu Val Thr Lys Val Val Ala Val Asp Gly Asp Ser
585 590 595

GGC CAG AAC GCC TGG CTG TCG TAC CAG CTG CTC AAG GCC ACG GAG CCC 2598
Gly Gln Asn Ala Trp Leu Ser Tyr Gln Leu Leu Lys Ala Thr Glu Pro
600 605 610

GGG CTG TTC GGC GTG TGG GCG CAC AAT GGC GAG GTG CGC ACC CCC AGG 2646
Gly Leu Phe Gly Val Trp Ala His Asn Gly Glu Val Arg Thr Ala Arg
615 620 625

CTG CTG AGC GAG CGC GAC GTG GCC AAG CAC AGG CTA GTG GTG CTG GTC 2694
Leu Leu Ser Glu Arg Asp Val Ala Lys His Arg Leu Val Val Leu Val
630 635 640

AAG GAC AAT GGC GAG CCT CCG CGC TCG GCC ACA GCC ACG CTC CAA GTG 2742
Lys Asp Asn Gly Glu Pro Pro Arg Ser Ala Thr Ala Thr Leu Gln Val
645 650 655 660

CTC CTG GTG GAC GGC TTC TCT CAG CCC TAC CTG CCG CTC CCA GAG GCG 2790
Leu Leu Val Asp Gly Phe Ser Gln Pro Tyr Leu Pro Leu Pro Glu Ala
665 670 675

GCC CCG GCC CAA GCC CAG GCC GAC TCG CTT ACC GTC TAC CTG GTG GTG 2838
Ala Pro Ala Gln Ala Gln Ala Asp Ser Leu Thr Val Tyr Leu Val Val
680 685 690

GCA TTG GCC TCG GTG TCT TCG CTC TTC CTC TTC TCG GTG TTC CTC TTC 2886
Ala Leu Ala Ser Val Ser Ser Leu Phe Leu Phe Ser Val Phe Leu Phe
695 700 705

GTG GCA GTG CGG CTG TGC AGG AGG AGC AGG GCG GCC TCA GTG GCT CGC 2934
Val Ala Val Arg Leu Cys Arg Arg Ser Arg Ala Ala Ser Val Gly Arg
710 715 720

TGC TCG GTG CCC GAG GGC CCC TTT CCA GGG CAT CTG GTG GAC GTG AGC 2982
Cys Ser Val Pro Glu Gly Pro Phe Pro Gly His Leu Val Asp Val Ser
725 730 735 740

GGC ACC GGG ACC CTT TCC CAG AGC TAC CAG TAC GAG GTG TCT CTG ACG 3030
Gly Thr Gly Thr Leu Ser Gln Ser Tyr Gln Tyr Glu Val Cys Leu Thr
745 750 755

GGA GGC TCT GAA ACT AAT GAT TTC AAG TTC TTG AAG CCT ATA TTC CCA 3078
Gly Gly Ser Glu Ser Asn Asp Phe Lys Phe Leu Lys Pro Ile Phe Pro
760 765 770

AAT ATT GTA AGC CAG GAC TCT AGG AGG AAA TCA GAA TTT CTA GAA 3123
Asn Ile Val Ser Gln Asp Ser Arg Arg Lys Ser Glu Phe Leu Glu
775 780 785

TAATGTAGGT ATCTGTAGCT TTCCGACCGT CTGTTAATTT TGTCTTCCTC ACTTTTCACC 3183

TTAGTTTTTT TTAACCCTTT AGTAATCTTG AATTCTACTT TTTTTTAAAT TTCTACTGTT 3243

GTCTTTAGTA ATGTTACTCA TTTCCTTTCT CTGATTGTTA GTTTTCAAAT TATTGTATTA 3303

TTATAAATAT TTTATATCAG GAAACTTCAT ATTTCTGAAT AAATTAATAG 3353

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 787 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Glu Pro Ala Gly Glu Arg Phe Pro Glu Gln Arg Gln Val Leu Ile
1 5 10 15

Leu Leu Leu Leu Leu Glu Val Thr Leu Ala Gly Trp Glu Pro Arg Arg
20 25 30

Tyr Ser Val Met Glu Glu Thr Glu Arg Gly Ser Phe Val Ala Asn Leu
35 40 45

Ala Asn Asp Leu Gly Leu Gly Val Gly Glu Leu Ala Glu Arg Gly Ala
50 55 60

Arg Val Val Ser Glu Asp Asn Glu Gln Gly Leu Gln Leu Asp Leu Gln
65 70 75 80

Thr Gly Gln Leu Ile Leu Asn Glu Lys Leu Asp Arg Glu Lys Leu Cys
85 90 95

Gly Pro Thr Glu Pro Cys Ile Met His Phe Gln Val Leu Leu Lys Lys
100 105 110

Pro Leu Glu Val Phe Arg Ala Glu Leu Leu Val Thr Asp Ile Asn Asp
115 120 125

His Ser Pro Glu Phe Pro Glu Arg Glu Met Thr Leu Lys Ile Pro Glu
130 135 140

Thr Ser Ser Leu Gly Thr Val Phe Pro Leu Lys Lys Ala Arg Asp Leu
145 150 155 160

Asp Val Gly Ser Asn Asn Val Gln Asn Tyr Asn Ile Ser Pro Asn Ser
165 170 175

His Phe His Val Ser Thr Arg Thr Arg Gly Asp Gly Arg Lys Tyr Pro
180 185 190

Glu Leu Val Leu Asp Thr Glu Leu Asp Arg Glu Glu Gln Ala Glu Leu
195 200 205

Arg Leu Thr Leu Thr Ala Val Asp Gly Gly Ser Pro Pro Arg Ser Gly
210 215 220



Thr Val Gln Ile Leu Ile Leu Val Leu Asp Ala Asn Asp Asn Ala Pro
225 230 235 240

Glu Phe Val Gln Ala Leu Tyr Glu Val Gln Val Pro Glu Asn Ser Pro
245 250 255

Val Gly Ser Leu Val Val Lys Val Ser Ala Arg Asp Leu Asp Thr Gly
260 265 270

Thr Asn Gly Glu Ile Ser Tyr Ser Leu Tyr Tyr Ser Ser Gln Glu Ile
275 280 285

Asp Lys Pro Phe Glu Leu Ser Ser Leu Ser Gly Glu Ile Arg Leu Ile
290 295 300

Lys Lys Leu Asp Phe Glu Thr Met Ser Ser Tyr Asp Leu Asp Ile Glu
305 310 315 320

Ala Ser Asp Gly Gly Gly Leu Ser Gly Lys Cys Ser Val Ser Val Lys
325 330 335

Val Leu Asp Val Asn Asp Asn Phe Pro Glu Leu Ser Ile Ser Ser Leu
340 345 350

Thr Ser Pro Ile Pro Glu Asn Ser Pro Glu Thr Glu Val Ala Leu Phe
355 360 365

Arg Ile Arg Asp Arg Asp Ser Gly Glu Asn Gly Lys Met Ile Cys Ser
370 375 380

Ile Gln Asp Asp Val Pro Phe Lys Leu Lys Pro Ser Val Glu Asn Phe
385 390 395 400

Tyr Arg Leu Val Thr Glu Gly Ala Leu Asp Arg Glu Thr Arg Ala Glu
405 410 415

Tyr Asn Ile Thr Ile Thr Ile Thr Asp Leu Gly Thr Pro Arg Leu Lys
420 425 430

Thr Glu Gln Ser Ile Thr Val Leu Val Ser Asp Val Asn Asp Asn Ala
435 440 445

Pro Ala Phe Thr Gln Thr Ser Tyr Thr Leu Phe Val Arg Glu Asn Asn
450 455 460

Ser Pro Ala Leu His Ile Gly Ser Val Ser Ala Thr Asp Arg Asp Ser
465 470 475 480

Gly Thr Asn Ala Gln Val Thr Tyr Ser Leu Leu Pro Pro Gln Asp Pro
485 490 495

His Leu Pro Leu Thr Ser Leu Val Ser Ile Asn Thr Asp Asn Gly His
500 505 510

Leu Phe Ala Leu Gln Ser Leu Asp Tyr Glu Ala Leu Gln Ala Phe Glu
515 520 525

Phe Arg Val Gly Ala Thr Asp Arg Gly Phe Pro Ala Leu Ser Ser Glu
530 535 540

Ala Leu Val Arg Val Leu Val Leu Asp Ala Asn Asp Asn Ser Pro Phe
545 550 555 560

Val Leu Tyr Pro Leu Gln Asn Gly Ser Ala Pro Cys Thr Glu Leu Val
565 570 575

Pro Arg Ala Ala Glu Pro Gly Tyr Leu Val Thr Lys Val Val Ala Val
580 585 590

Asp Gly Asp Ser Gly Gln Asn Ala Trp Leu Ser Tyr Gln Leu Leu Lys
595 600 605

Ala Thr Glu Pro Gly Leu Phe Gly Val Trp Ala His Asn Gly Glu Val
610 615 620

Arg Thr Ala Arg Leu Leu Ser Glu Arg Asp Val Ala Lys His Arg Leu
625 630 635 640

Val Val Leu Val Lys Asp Asn Gly Glu Pro Pro Arg Ser Ala Thr Ala
645 650 655

Thr Leu Gln Val Leu Leu Val Asp Gly Phe Ser Gln Pro Tyr Leu Pro
660 665 670

Leu Pro Glu Ala Ala Pro Ala Gln Ala Gln Ala Asp Ser Leu Thr Val
675 680 685

Tyr Leu Val Val Ala Leu Ala Ser Val Ser Ser Leu Phe Leu Phe Ser
690 695 700

Val Phe Leu Phe Val Ala Val Arg Leu Cys Arg Arg Ser Arg Ala Ala
705 710 715 720

Ser Val Gly Arg Cys Ser Val Pro Glu Gly Pro Phe Pro Gly His Leu
725 730 735

Val Asp Val Ser Gly Thr Gly Thr Leu Ser Gln Ser Tyr Gln Tyr Glu
740 745 750

Val Cys Leu Thr Gly Gly Ser Glu Ser Asn Asp Phe Lys Phe Leu Lys
755 760 765

Pro Ile Phe Pro Asn Ile Val Ser Gln Asp Ser Arg Arg Lys Ser Glu
770 775 780

Phe Leu Glu
785

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 138..2528

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GTGATTGGAC GTGTTTTTGT GACTATTTGG GAAGAAGACA CCTTCCTAAT CAGATTTACT 60

CCAATATCTT CCCGGACCCT CATGAGTGCA TTGCAATTCA CTTGAAGAAG CAGCACCCTC 120

AGGACTGAAT CTGAACA ATG GAG ACA GCA CTA CCA AAA ATA CCA GAG CAA 170
Met Glu Thr Ala Leu Ala Lys Ile Pro Gln Gln
1 5 10

AGG CAA GTC TTT TTT CTT ACT ATA TTG TCG TTA TTG TGG AAG TCT AGC 218
Arg Gln Val Phe Phe Leu Thr Ile Leu Ser Leu Leu Trp Lys Ser Ser
15 20 25

TCT GAG GCC ATT AGA TAT TCC ATG CCA GAA GAA ACA GAG ACT GGC TAT 266
Ser Glu Ala Ile Arg Tyr Ser Met Pro Glu Glu Thr Glu Ser Gly Tyr
30 35 40

ATG GTG GCT AAC CTG GCG AAA GAT CTG GGG ATC AGG GTT GGA GAA CTG 314
Met Val Ala Asn Leu Ala Lys Asp Leu Gly Ile Arg Val Gly Glu Leu
45 50 55

TCC TCT AGA GGA GCT CAA ATC CAT TAC AAA GGA AAC AAA GAA CTT TTG 362
Ser Ser Arg Gly Ala Gln Ile His Tyr Lys Gly Asn Lys Glu Leu Leu
60 65 70 75

CAG CTG GAT GCA GAG ACT GGC AAT TTG TTC TTA AAG GAA AAA CTA GAC 410
Gln Leu Asp Ala Glu Thr Gly Asn Leu Phe Leu Lys Glu Lys Leu Asp
80 85 90

AGA GAA CTG CTC TGT GGA GAG ACA GAA CCC TGT GTG CTG AAC TTC CAG 458
Arg Glu Leu Leu Cys Gly Glu Thr Glu Pro Cys Val Leu Asn Phe Gln
95 100 105

ATC ATA CTG GAA AAC CCT ATG CAG TTC TTC CAA ACT GAA CTG CAG CTC 506
Ile Ile Leu Glu Asn Pro Met Gln Phe Phe Gln Thr Glu Leu Gln Leu
110 115 120

ACA GAT ATA AAC GAC CAT TCT CCA GAG TTC CCC AAC AAG AAA ATG CTT 554
Thr Asp Ile Asn Asp His Ser Pro Glu Phe Pro Asn Lys Lys Met Leu
125 130 135

CTA ACA ATT CCT GAG ACT GCC CAT CCA GGG ACT GTG TTT CCT CTG AAG 602
Leu Thr Ile Pro Glu Ser Ala His Pro Gly Thr Val Phe Pro Leu Lys
140 145 150 155

a?* »? T f CG ? AC TCT GAC ATA 000 ACC GCT CTT CAG AAC TAC ACA 650
Ala Ala Arg Asp Ser Asp Ile Gly Ser Asn Ala Val Gln Asn Tyr Thr
160 165 170

GTC AAT CCC AAC CTC CAT TTC CAC GTC GTT ACT CAC ACT CCC ACA GAT 698
Val Asn Pro Asn Leu His Phe His Val Val Thr His Ser Arg Thr Asp
175 180 185

CCC AGG AAA TAC CCA GAG CTG CTG CTG GAC AGA GCC CTG GAT AGC GAG 746
Gly Arg Lys Tyr Pro Glu Leu Val Leu Asp Arg Ala Leu Asp Arg Glu
190 195 200

GAG CAG CCT GAG CTC ACT TTA ATC CTC ACT GCT CTG GAT GGT GGA GCT 794
Glu Gln Pro Glu Leu Thr Leu Ile Leu Thr Ala Leu Asp Gly Gly Ala
205 210 215

CCT TCC AGC TCA GGA ACC ACC ACA GTT CAC ATA GAA GTT CTG GAC ATC 842
Pro Ser Arg Ser Gly Thr Thr Thr Val His Ile Glu Val Val Asp Ile
220 225 230 235

AAT GAT AAC TCC CCC CAG TTT GTA CAG TCA CTC TAT AAG GTG GAA GTT 890
Asn Asp Asn Ser Pro Gln Phe Val Gln Ser Leu Tyr Lys Val Gln Val
240 245 250

CCT GAG AAT AAT CCC CTC AAT GCC TTT GTT GTC ACC CTC TCT CCC ACC 938
Pro Glu Asn Asn Pro Leu Asn Ala Phe Val Val Thr Val Ser Ala Thr
255 260 265

GAT TTA GAT GCT GGG GTA TAT GCC AAT GTG ACC TAT TCT CTG TTT CAA 986
Asp Leu Asp Ala Gly Val Tyr Gly Asn Val Thr Tyr Ser Leu Phe Gln
270 275 280

GGG TAT GGG GTA TTT CAA CCA TTT GTA ATA GAC GAA ATC ACT GGA CAA 1034
Gly Tyr Gly Val Phe Gln Pro Phe Val Ile Asp Glu Ile Thr Gly Glu
285 290 295

ATC CAT CTG AGC AAA CAG CTG GAT TTT GAG GAA ATT AGC AAT CAT AAC 1082
Ile His Leu Ser Lys Glu Leu Asp Phe Glu Glu Ile Ser Asn His Asn
300 305 310 315

ATA GAA ATC GCA GCC ACA GAT GGA GGA GCC CTT TCA CCA AAA TGC ACT 1130
Ile Glu Ile Ala Ala Thr Asp Gly Gly Gly Leu Ser Gly Lys Cys Thr
320 325 330

SI? ?, T ? 5 AG GT ? TTG CAT GTC *** GAC **C CCC CCA GAG TTG ACA 1178
Val Ala Val Gln Val Leu Asp Val Asn Asp Asn Ala Pro Glu Leu Thr
335 340 345

ATT AGG AAG CTC ACA GTC CTG GTC CCA GAA AAT TCC GCA GAG ACT GTA 1226
Ile Arg Lys Leu Thr Val Leu Val Pro Glu Asn Ser Ala Glu Thr Val
350 355 360

GTT GCT GTT TTT ACT CTT TCT GAT TCT GAT TCG GGG GAC AAT GGA AGG 1274
Val Ala Val Phe Ser Val Ser Asp Ser Asp Ser Gly Asp Asn Gly Arg
365 370 375

mI G 5 TG I° T I CT A F CCG AAC *** ATC CCA CTC CTC AAA CCC ACA 1322
Met Val Cys Ser Ile Pro Asn Asn Ile Pro Phe Leu Leu Lys Pro Thr
380 385 390 395

III ^ G AAT I AT TAC ACG TTA GTC ACT GAG CGG CCA CTT GAT AGA GAG 1370
Phe Glu Asn Tyr Tyr Thr Leu Val Thr Glu Gly Pro Leu Asp Arg Glu
400 405 410

AGA GCT GAG TAC ^ ATC ACC ATC ACG CTC TCA CAT CTG GGC ACA 1418
Asn Arg Ala Glu Tyr Asn Ile Thr Ile Thr Val Ser Asp Leu Gly Thr
415 420 425

CCC AGC CTC ACA ACC CAG CAC ACC ATA ACA CTC CAA CTC TCC GAC ATC 1466
Pro Arg Leu Thr Thr Gln His Thr Ile Thr Val Gln Val Ser Asp Ile
430 435 440

AAC GAC AAC GCC CCT GCC TTC ACC CAA ACC TCC TAC ACC ATG TTT GTC 1514
Asn Asp Asn Ala Pro Ala Phe Thr Gln Thr Ser Tyr Thr Met Phe Val
445 450 455

SIS 520

Si! S £ a S SS 52 Sg S a a SJ g; a £ S »"

530 535



CAC GAG AAC AAC AGC CCC GCC CTC CAC ATA GGC Arc irr a™ 
Hi. Glu Asn Aen Ser Pro Ala Leu hJs Si £J J£ J2 J£ *g 

470 475 

GAC TCA GAC TCA GGC TCC AAT CCC CAC ATC ACC TAC TCC CTC CTC CCC 
Asp ser Asp Ser Gly Ser Aan Ala Hie lie Thr Tyr sS 2 2S pS 
480 48 5 490 



1610 



1658 



CCT GAT GAC CCG CAG CTG GCC CTC GAC TCA CTC ATC TCC ATC a.t 
Pro Asp Asp Pro Gin Leu Ala Leu Asp Ser SS J5 2? i2 v" 

500 5q 5 

a a a a 2 a a a a s si - - - s s 1M . 



1802 



1850 



1946 



CTC AGC AGC CAG ACT CTG GTG CGG ATG GTG CTC CTG GAT GAC AAT CAC 
Leu Ser Ser Gin Thr Leu Val Arc, Met Val Val lZ Ae*p 2J J£ £p 

S4S 550 S5S 

A^n PrS J2 21? f T ° ^ AC CCA CTG CAG GCC TCA GCA CCC TGT 

Asn Ala Pro Phe Val Leu Tyr Pro Leu Gin Asn Ala Ser Ala Pro Cys 

565 57 o 

S SS 2 2 a a S 22 a 2 ST? a 2 J2 £ £ 

580 535 

55 SS S 52 !E f GC TCT *** CCT «* CTG TCG TTC 

Val val Ala Val Asp Arg Asp Ser Gly Gin Asn Ala Trp Leu Ser Se 

595 600 

a a s k s a a a a a a a a a a a 
a a a a a a a a 2 a a £ a - s - 

625 630 635 

a a a 2 2 2 a a a a a a a 2 a 

645 650 

a a a a a 2 a a a a a a a a a a 

660 655 



1994 



2042 



2090 



2138
!5 S? Si S 2 E K S S 5 2? X K 5 S

675 680 

s? 5 si s s s g si s s s ss a = a s - 

695 

s s s s s s si s? s s s? ss s is a »« 

710 715 
AGG AGG GCC AGG GAG GCC TCC TTC ctt ~ 

*r 9 „. ^ „ Al . ™ ™ «* «» »C TCT OTO C=T CAC CCA 23 ,0 

725 730 
CAC TTT CCT AGC CAC TTG GTG GAT GTC AGC err r>nn .„ 

ph - p <° ?s Hi - ^ »" s? ss tr. m ?S5 ss s 



740 745 



SS = £ S S K S S S S 5 5 S S3 iS K 



760 



s {E E J2 S £ S 2 5 S S S s s s s 

s s s s: s: £ s: an ss sit s £ k s 5 sj 

790 795 
K TAGAGCACTG ATTTTGAAGT GGTGGTTACC TCATTTTTCC TTAACTATCC 2578

CTGATCTACA ATGGTCTACT GCCGTGAATC AACXCCTGAG ATATATGTTC ATTTTATCCT 2638
TTGTTTTGAA TCAAACTATT CAGATGTGAT CCTACTCTAG AGAATTTGGT TCTACTCCAT 2698
TGTGTTTGTT TAGATTTCTA CGCCATACCA GTGCATGCTG GGTTGTTTTT TTTTTTACAA 2758
TTATTATAAC TTTGCTTTGG AGGGGAACTC ATATTCGCTG TAACGAATTG GAACCACTTT 2818
CATTGTTAGA GATGCCTTGC TTTGTTGTCT TATTTCAGAC AGGGTCTTAA ATTGTAGCCC 2878
TGGGTGACCT GAAATGACTA TGTACAGACT GACTTTGAAT TTGTGGCAGT CCATCTGCCT 2938
CTGTTGTCCT ATGTTGGGAT TGTGAGCATG CATGAGTAGG CTCAGCTGTG GTGAGCGACC 2998
TTAATAAAAA TCAAATACTA AAAAAAAAAA AAAAA 3033

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 797 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

2426

2474

2522

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Met Glu Thr Ala Leu Ala Lys Ile Pro Gln Gln Arg Gln Val Phe Phe
1 5 10 15

Leu Thr Ile Leu Ser Leu Leu Trp Lys Ser Ser Ser Glu Ala Ile Arg
20 25 30

Tyr Ser Met Pro Glu Glu Thr Glu Ser Gly Tyr Met Val Ala Asn Leu
35 40 45

Ala Lys Asp Leu Gly Ile Arg Val Gly Glu Leu Ser Ser Arg Gly Ala
50 55 60

Gln Ile His Tyr Lys Gly Asn Lys Glu Leu Leu Gln Leu Asp Ala Glu
65 70 75 80

Thr Gly Asn Leu Phe Leu Lys Glu Lys Leu Asp Arg Glu Leu Leu Cys
85 90 95

Gly Glu Thr Glu Pro Cys Val Leu Asn Phe Gln Ile Ile Leu Glu Asn
100 105 110

Pro Met Gln Phe Phe Gln Thr Glu Leu Gln Leu Thr Asp Ile Asn Asp
115 120 125

His Ser Pro Glu Phe Pro Asn Lys Lys Met Leu Leu Thr Ile Pro Glu
130 135 140

Ser Ala His Pro Gly Thr Val Phe Pro Leu Lys Ala Ala Arg Asp Ser
145 150 155 160

Asp Ile Gly Ser Asn Ala Val Gln Asn Tyr Thr Val Asn Pro Asn Leu
165 170 175

His Phe His Val Val Thr His Ser Arg Thr Asp Gly Arg Lys Tyr Pro
180 185 190

Glu Leu Val Leu Asp Arg Ala Leu Asp Arg Glu Glu Gln Pro Glu Leu
195 200 205

Thr Leu Ile Leu Thr Ala Leu Asp Gly Gly Ala Pro Ser Arg Ser Gly
210 215 220

Thr Thr Thr Val His Ile Glu Val Val Asp Ile Asn Asp Asn Ser Pro
225 230 235 240

Gln Phe Val Gln Ser Leu Tyr Lys Val Gln Val Pro Glu Asn Asn Pro
245 250 255

Leu Asn Ala Phe Val Val Thr Val Ser Ala Thr Asp Leu Asp Ala Gly
260 265 270

Val Tyr Gly Asn Val Thr Tyr Ser Leu Phe Gln Gly Tyr Gly Val Phe
275 280 285
Gln Pro Phe Val Ile Asp Glu Ile Thr Gly Glu Ile His Leu Ser Lys
290 295 300

Glu Leu Asp Phe Glu Glu Ile Ser Asn His Asn Ile Glu Ile Ala Ala
305 310 315 320

Thr Asp Gly Gly Gly Leu Ser Gly Lys Cys Thr Val Ala Val Gln Val
325 330 335

Leu Asp Val Asn Asp Asn Ala Pro Glu Leu Thr Ile Arg Lys Leu Thr
340 345 350

Val Leu Val Pro Glu Asn Ser Ala Glu Thr Val Val Ala Val Phe Ser
355 360 365

Val Ser Asp Ser Asp Ser Gly Asp Asn Gly Arg Met Val Cys Ser Ile
370 375 380

Pro Asn Asn Ile Pro Phe Leu Leu Lys Pro Thr Phe Glu Asn Tyr Tyr
385 390 395 400

Thr Leu Val Thr Glu Gly Pro Leu Asp Arg Glu Asn Arg Ala Glu Tyr
405 410 415

Asn Ile Thr Ile Thr Val Ser Asp Leu Gly Thr Pro Arg Leu Thr Thr
420 425 430

Gln His Thr Ile Thr Val Gln Val Ser Asp Ile Asn Asp Asn Ala Pro
435 440 445

Ala Phe Thr Gln Thr Ser Tyr Thr Met Phe Val His Glu Asn Asn Ser
450 455 460

Pro Ala Leu His Ile Gly Thr Ile Ser Ala Thr Asp Ser Asp Ser Gly
465 470 475 480

Ser Asn Ala His Ile Thr Tyr Ser Leu Leu Pro Pro Asp Asp Pro Gln
485 490 495

Leu Ala Leu Asp Ser Leu Ile Ser Ile Asn Val Asp Asn Gly Gln Leu
500 505 510

Phe Ala Leu Arg Ala Leu Asp Tyr Glu Ala Leu Gln Ser Phe Glu Phe
515 520 525

Tyr Val Gly Ala Thr Asp Gly Gly Ser Pro Ala Leu Ser Ser Gln Thr
530 535 540

Leu Val Arg Met Val Val Leu Asp Asp Asn Asp Asn Ala Pro Phe Val
545 550 555 560

Leu Tyr Pro Leu Gln Asn Ala Ser Ala Pro Cys Thr Glu Leu Leu Pro
565 570 575

Arg Ala Ala Glu Pro Gly Tyr Leu Ile Thr Lys Val Val Ala Val Asp
580 585 590

Arg Asp Ser Gly Gln Asn Ala Trp Leu Ser Phe Gln Leu Leu Lys Ala
595 600 605

Thr Glu Pro Gly Leu Phe Ser Val Trp Ala His Asn Gly Glu Val Arg
610 615 620

Thr Thr Arg Leu Leu Ser Glu Arg Asp Ala Gln Lys His Lys Leu Leu
625 630 635 640

Leu Leu Val Lys Asp Asn Gly Asp Pro Leu Arg Ser Ala Asn Val Thr
645 650 655

Leu His Val Leu Val Val Asp Gly Phe Ser Gln Pro Tyr Leu Pro Leu
660 665 670

Ala Glu Val Ala Gln Asp Ser Met Gln Asp Asn Tyr Asp Val Leu Thr
675 680 685

Leu Tyr Leu Val Ile Ala Leu Ala Ser Val Ser Ser Leu Phe Leu Leu
690 695 700

Ser Val Val Leu Phe Val Gly Val Arg Leu Cys Arg Arg Ala Arg Glu
705 710 715 720

Ala Ser Leu Gly Asp Tyr Ser Val Pro Glu Gly His Phe Pro Ser His
725 730 735

Leu Val Asp Val Ser Gly Ala Gly Thr Leu Ser Gln Ser Tyr Gln Tyr
740 745 750

Glu Val Cys Leu Asn Gly Gly Thr Arg Thr Asn Glu Phe Asn Phe Leu
755 760 765

Lys Pro Leu Phe Pro Ile Leu Pro Thr Gln Ala Ala Ala Ala Glu Glu
770 775 780

Arg Glu Asn Ala Val Val His Asn Ser Val Gly Phe Tyr
785 790 795

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2347 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:



AAAACACGGG 


GGAAATGACA 


GTAGCAAAGA 


ATCTGGACTA 


TGAAGAATGC 


TCATTGTATG 


60 


AAATGGAAAT 


ACAGGCTGAA 


GATGTGGGGG 


CGCTTCTGGG 


GAGGAGCAAA 


GTGGTAATTA  120
TGGTAGAAGA TGTAAATGAC AATCGGCCAG AAGTGACCAT TACATCCTTG TTTAACCCGG  180
TATTGGAAAA TTCTCTTCCC CGGACAGTAA TTGCCTTCTT GAATCTCCAT GACCGAGACT  240
CTGGAAAGAA CGGCCAAGTT GTCTGTTACA CGCATGATAA CTTACCTTTT AAATTAGAAA  300
AGTCAATAGA TAATTATTAT AGATTGGTGA CATGGAAATA TTTGGACCGA GAAAAAGTCT  360
CCATCTACAA TATCACAGTG ATAGCCTCAG ATCTAGGAGC CCACTCTGTC ACTGAAACTT  420
ACATTGCCCT GATTGTCCCA GACACTAATG ACAACCCTCC TCCTTTTCCT CACACCTCCT  480
ACACAGCCTA TATTCCAGAG AACAACCTCA GGGGCGCCTC CATCTTCTCA CTGACTGCAC  540
ATGATCCTGA CAGTCAGGAA AATGCACAGG TCACTTACTC TGTGTCTGAG GACACCATAC  600
AGGGAGTGCC TTTGTCCTCT TATATCTCCA TCAACTCAGA TACTGCTGTC CTGTATGCAC  660
TGCACTCTTT TGACTTCGAG AAGATACAAG ACTTGCAGCT ACTGCTTGTT GCCACTGACA  720
GTGGAAGCCC ACCTCTCAGC AGCAATGTGT CATTGAGCTT GTTTGTGTTG CACCAGAACG  780
ACAACGCACC TGAGATTCTA TATCCTAGCT TCCCCACAGA TGGCTCCACT GGTGTGGAAC  840
TAGCACCCCG CTCTGCAGAG CCTGGATACC TAGTGACCAA AGTGCTGGCA GTGGACAAAG  900
ACTCAGGACA GAATGCTTGG CTGTCCTACC GTCTGCTGAA GGCCAGCGAA CCTGGGCTCT  960
TCTCTGTAGG ACTTCACACG GGTGAGGTGC GTACAGCGAG GGCCCTGCTG GACAGAGATG  1020
CTCTCAAACA GAATCTGGTG ATGGCCGTGC AGGACCATGG CCAACCCCCT CTCTCGCCCA  1080
CTGTAACTCT CACTGTGGCA GTGGCTAACA GCATCCCTGA GGTGTTGGCT GACTTGAGCA  1140
GCATTAGGAC CCCTGGGGTA CCAGAGGATT CTGATATCAC GCTCCACCTG GTGGTGGCAG  1200
TGGCTGTGGT CTCCTGTGTC TTCCTTGTCT TTGTCATTGT CCTCCTAGCT CTCAGCCTTC  1260
AGCGCTGGCA GAAGTCTCGC CAGCTCCAGG CCTCCAAAGG TGGATTGGCT CCTGCACCTC  1320
CATCACATTT TGTGGGCATC GACGGGGTAC AGGCTTTTCT ACAAACCTAT TCTCATGAAG  1380
TCTCGCTCAC TTCAGGCTCC CAGACAAGCC ACATTATCTT TCCTCAGCCC AACTATGCAG  1440
ACATGCTCAT TAACCAAGAA GGCTGTGAGA AAAATGATTC CTTATTAACA TCCATAGATT  1500
TTCATGAGAG TAACCGTGAA GATGCTTGCG CCCCGCAAGC CCCGCCCAAC ACTGACTGGC  1560
GTTTCTCTCA AGCCCAGAGA CCCGGCACGA GCGGATCCCA AAATGGGGAT GAAACCGGCA  1620
CCTGGCCCAA CAACCAGTTC GATACAGAGA TGCTGCAAGC CATGATCTTG GCCTCTGCCA  1680
GTGAAGCCGC TGATGGGAGC TCCACTCTGG GAGGGGGCAC TGGCACTATG GGTTTGAGCG  1740
CTCGATATGG ACCCCAGTTT ACCCTGCAGC ACGTGCCTGA CTACCGCCAG AACGTGTACA  1800
TCCCTGGCAG CAATGCCACA CTGACCAACG CAGCTGGCAA ACGAGATGGC AAGGCTCCGG  1860
CAGGCGGCAA TGCCAACAAC AACAAGTCGG GCAAGAAAGA GAAGAAGTAA TATGGAGGCC  1920
AGGCCTTGAG CCACAGGGCA GCCTCCCTCC CCAGCCAGTC CAGCTTGTCC TTACTTGTAC  1980
CCAGGCCTCA GAATTTCAGG GCTCACCCCA GGATTCTGGT AGGAGCCACA GCCAGCCCAT  2040
CCTCCCCGTT GGCAAACAGA AACAAGTGCC CAAGCCAACA CCCCCTCTTT GTACCCTAGG  2100
GGGGTTGAAT ATGCAAAGAG AGTTCTGCTG GGACCCCCTA TCCAATCAGT GATTGTACCC  2160
ACATAGGTAG CAGGGTTAGT GTGGATACAC ACACACACAC ACACACACAC ACACACACAA  2220
CCCTTGTCCT CCGCAGTGCC TGCCACTTTC TGGGACTTTC TCATCCCCCT ACGCCCTTCC  2280
TTTATCCTCT CCCACCCAGA CACAGCTCCT GGAGAATAAA TTTGCGGATC CTCATCCTAA  2340
AAAAAAA  2347

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2972 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..1849

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

* 2 S 2 2 2 2! "? 2 2 2 2 2 S? 2 £ 
2 2 2 IS 2 £ 52 S| S SS 2 51? 2 22 
2 2! £ 2 2 2 2 2 2 2 2 2 25 2 2 52 

40 A* 



2: 2 2 2 2 2 2 2 2S 2! 25 2 2 £ 2 2 

55 60 
GAC CCG GAT GAG GGA ATC AAr rva m 

»=p ,. P «. « y 2 2 2 £ 2 2 « 2 S «« - 

70 75 



90 95 



2340 
2347 



46 



94 



142 



190 



238 



2 2 2 5: 2 s 2 2; s 2 s 2 2 2 s 2 - 



2 2 2 2 2 2 2 2i 2 2 2 2 2 2 2 2 

105 110 

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ~ 
£ 2 2 2 2 2 2 2 « 2 2 2 2 2 2 2 « 

135 140 

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 - 



155 






£ 52 £ S S 2 a 51? S5 a a a 2 SS £ 2 

170 175 

CGT CAA GTT GTC TGT TAC ACA CGT GAT AAT TTA CCT TTT &&& ™ „„„ 
Gly Gln Val Val Cys Tyr Thr Arg Asp Asn Leu Pro Phe Lys Leu Glu

185 190 

S 2 S jg 2 ^ a K g » - - » « s « 

£ Sii 2 51? 2 2 2 2 2 2 51? is K 2 2 2 •» 
S 2 2 2 = 2 £ K S 2 S S 2 his 51? a ™ 

235 



a s a a a a a a a a s s a a a a 

250 255 



2 a 2 a a a a a a a a a a is a 2 

360 3fi c 



K S Sg 5! SS S S So I cc S* °? c 001 GGC ™ «» 

370 . Ser A1 » cl « Arg Gly Tyr Leu 

375 380 

55 K J£ JS ?S !2 51? 2S £ CA GAC TCG 000 ^ c too 

385 7 1 Ala JiJ AS P Ar * Afl P s « Gin Aon Ala Trp 



526 



574 



622 



766 



814 



862 



910 



a 2 a a a a a a 2 a a a a a a a 

265 270 

s a a a a a a a a a a a a a 2 a 

280 285 

s 2 a a a a a a a a a a a s a a 

295 300 
AAC TCT GAC ACC CGT GTC CTG TAT GCC r-rr r» 

Asn Ser Asp Thr Gly Val Leu ?ir Ala SS IS! £° * AC TAT GAG *** 

305 31 q y AXa Leu Gln Ser p he Asp Tyr Glu 

a a a 2 2 a a a si? a a a a a a 2? - 

330 335 

a a a a a a s a a a a a a a a a 

345 



1054 



1102 



1150 



1198 






CTC TCC TAC CGC CTC CTC AAC CCC AGC GAG CCG GGA CTC TTC TCG GTC  1246
Leu Ser Tyr Arg Leu Leu Lys Ala Ser Glu Pro Gly Leu Phe Ser Val
400 405 410 415

GGT CTG CAC ACG GGC GAG GTG CGC ACG GCG CGA GCC CTG CTG GAC ACA  1294
Gly Leu His Thr Gly Glu Val Arg Thr Ala Arg Ala Leu Leu Asp Arg
420 425 430

GAC GCG CTC AAG CAG AGC CTC GTG GTG GCC GTC CAG CAC CAT CGC CAC  1342
Asp Ala Leu Lys Gln Ser Leu Val Val Ala Val Gln Asp His Gly Gln
435 440 445

CCC CCT CTC TCC GCC ACT GTC ACG CTC ACC GTA GCC GTG GCT GAC AGC  1390
Pro Pro Leu Ser Ala Thr Val Thr Leu Thr Val Ala Val Ala Asp Ser
455 460

ATC CCC GAA CTC CTG ACC GAC TTG GGC ACT CTG AAG CCT TCC CTC GAC  1438
Ile Pro Glu Val Leu Thr Glu Leu Gly Ser Leu Lys Pro Ser Val Asp
465 470 475

CCG AAC GAT TCG ACC CTT ACA CTC TAT CTC GTC GTG CCA CTG GCT GCC  1486
Pro Asn Asp Ser Ser Leu Thr Leu Tyr Leu Val Val Ala Val Ala Ala
480 485 490 495

ATC TCC TGT GTC TTC CTC GCC TTT GTC GCT GTG CTT CTG CGG CTC AGC  1534
Ile Ser Cys Val Phe Leu Ala Phe Val Ala Val Leu Leu Gly Leu Arg
500 505 510

CTG AGG CGC TGG CAC AAG TCA CGC CTC CTC CAG GAT TCC GGT GCC AGA  1582
Leu Arg Arg Trp His Lys Ser Arg Leu Leu Gln Asp Ser Gly Gly Arg
515 520 525

Jl G f, T ? CCT CCC TCA ^ m GTC GTT GAG GAG GTA CAC  1630
Leu Val Gly Val Pro Ala Ser His Phe Val Gly Val Glu Glu Val Gln
530 535 540

GCT TTC CTG CAG ACC TAT TCC CAG GAA CTC TCC CTC ACC GCC GAC TCC  1678
Ala Phe Leu Gln Thr Tyr Ser Gln Glu Val Ser Leu Thr Ala Asp Ser
545 550 555

CGG AAG ACT CAC CTG ATC TTT CCC CAG CCC AAC TAC CCA GAC ATG CTC  1726
Arg Lys Ser His Leu Ile Phe Pro Gln Pro Asn Tyr Ala Asp Met Leu
560 565 570 575

ATC ACT CAG GAG GGC TGT GAG AAA AAT GAT TCT TTG TTA ACA TCC GTA  1774
Ile Ser Gln Glu Gly Cys Glu Lys Asn Asp Ser Leu Leu Thr Ser Val
580 585 590

GAT TTT CAT GAA TAT AAG AAT GAA GCT GAT CAT GGT CAG GTG ACT TTA  1822
Asp Phe His Glu Tyr Lys Asn Glu Ala Asp His Gly Gln Val Ser Leu
595 600 605

GTT CTT TGC TTG CTT TTA ATT TCC AGA TGAATTTTAT TTGGCATAAA  1869
Val Leu Cys Leu Leu Leu Ile Ser Arg

TTATGTTTTG AAAAACATTG TGAAGATAGT TGAAAATAAT TTTTAAGGTG TATCACAGAG  1929
TTTTGGGTTT ATTTTGGTGG TGTTACCAAA AAATTGAACT CTAATAGTCA TAGGTTATTG  1989
TTTCATTTGC TTTTAAACGA CTTGGAAAAG ATTCTTCCAC CATTTTAAAC CTTCCAGTAT  2049



TTTATTCCTA TTATCACTCA TTCACTTAAG AAGTAGCTAC CCGTCCATAC TGGTAATTTT  2109
CCTATTGTTT GTTTGTGTGT GTGTGTGTGT GTGTGTGTGT GTGTGTGTAT CCCAAACTAG  2169
AACTTCAGAA AATTATCAAG AAGTCTAAAG CCTTGTTATT AGCTTAGCAA AAGTAAAATA  2229
TATCTCAGAA TTTTTAGGGT TATGTTTAGC ATTTGAACCT GTAACTAGGC TCTTGTATAT  2289
TTCTTCACTT TAAACCTCTT TTCTGAGCCC TGTTTCTGTA CCAGTGCCCT TCAAAACTTT  2349
AATACTTCTT ACCATCCTTC AAAACATGAA CAAACTTTAA AGATGGATCT TGGTGGGAGA  2409
TGAGACTGGT TACTAAATAT TAAGTATCTG AGTCAGTGGT CACCTGGGCT CCATCCCCAT  2469
GGAGACATGA AATCTAAAGC CTAGAATGTC CATTGCTCCC CCAAACAAAA AACAAAAGCA  2529
AAAACATTAG ATCTGAATTA AAATGTAATT TTAAACTGTT GAAAGTGACT TTTGTAAAAT  2589
ATGTAAGAAC ATATTTCAAT ACAATTCCAA TTAGCTGTTT CGGTTGTGCA TTGATGTGAA  2649
GTGGTGAGAA TGTTGATATT AAGAACCAAT GTTTCAGGTA CACAAGTTCT AAATAAGCTG  2709
ATCAATTCAA TTAAAGTTAT TCAGTCTTGG CTGGACACAG TGCCTCATGT CTGAAATCCC  2769
AGCACTTTGG GAGGCTGGGG CAGGAGGACC GCTTGAGCCC CGGGGGTTTG AAACTGCAGT  2829
GAGCTATGAT CATGCCAGTG CACTCCAGCC TAGGTGGCAG AACTAGACCC TGTCTCTAAA  2889
AAAACTATTA TTAGGCCGCG TGCGGTGGCT CACGCCTGTA ATCCCAGCAC TTTGGGAGAC  2949
TGAGGTGGGT GGATCACCTG AGC  2972

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 616 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Glu Ala Ala His His Leu Val Leu Thr Ala Ser Asp Gly Gly Lys Pro
1 10 15

Pro Arg Ser Ser Thr Val Arg Ile His Val Thr Val Leu Asp Thr Asn
25 30

Asp Asn Ala Pro Val Phe Pro His Pro Ile Tyr Arg Val Lys Val Leu
40 45

Glu Asn Met Pro Pro Gly Thr Arg Leu Leu Thr Val Thr Ala Ser Asp

Pro Asp Glu Gly Ile Asn Gly Lys Val Ala Tyr Lys Phe Arg Lys Ile

Asn Glu Lys Gln Thr Pro Leu Phe Gln Leu Asn Glu Asn Thr Gly Glu
85 90 95

Ile Ser Ile Ala Lys Ser Leu Asp Tyr Glu Glu Cys Ser Phe Tyr Glu
105 110

Met Glu Ile Gln Ala Glu Asp Val Gly Ala Leu Leu Gly Arg Thr Lys

Leu Leu Ile Ser Val Glu Asp Val Asn Asp Asn Arg Pro Glu Val Ile
135 140

Ile Thr Ser Leu Phe Ser Pro Val Leu Glu Asn Ser Leu Pro Gly Thr
150 155 160

Val Ile Ala Phe Leu Ser Val His Asp Gln Asp Ser Gly Lys Asn Gly

Gln Val Val Cys Tyr Thr Arg Asp Asn Leu Pro Phe Lys Leu Glu Lys
185 190

Ser Ile Gly Asn Tyr Tyr Arg Leu Val Thr Arg Lys Tyr Leu Asp Arg

Glu Asn Val Ser Ile Tyr Asn Ile Thr Val Met Ala Ser Asp Leu Gly

Thr Pro Pro Leu Ser Thr Glu Thr Gln Ile Ala Leu His Val Ala Asp
230 235 240

Ile Asn Asp Asn Pro Pro Thr Phe Pro His Ala Ser Tyr Ser Ala Tyr
245 250 255

Ile Leu Glu Asn Asn Leu Arg Gly Ala Ser Ile Phe Ser Leu Thr Ala
265 270

His Asp Pro Asp Ser Gln Glu Asn Ala Gln Val Thr Tyr Ser Val Thr
280 285

Glu Asp Thr Leu Gln Gly Ala Pro Leu Ser Ser Tyr Ile Ser Ile Asn
300

Ser Asp Thr Gly Val Leu Tyr Ala Leu Gln Ser Phe Asp Tyr Glu Gln
315 320

Ile Arg Asp Leu Gln Leu Leu Val Thr Ala Ser Asp Ser Gly Asp Pro

Pro Leu Ser Ser Asn Met Ser Leu Ser Leu Phe Val Leu Asp Gln Asn

Asp Asn Ala Pro Glu Ile Leu Tyr Pro Ala Leu Pro Thr Asp Gly Ser
360 365

Thr Gly Val Glu Leu Ala Pro Arg Ser Ala Glu Arg Gly Tyr Leu Val
375 380

385 A1 * J}J Afl P ^ *«P Gly Gln Asn Ala Trp Leu
395 400

Ser Tyr Arg Leu Leu Lys Ala Ser Glu Pro Gly Leu Phe Ser Val Gly
410 415

Leu His Thr Gly Glu Val Arg Thr Ala Arg Ala Leu Leu Asp Arg Asp
425 430

Ala Leu Lys Gln Ser Leu Val Val Ala Val Gln Asp His Gly Gln Pro
440 445

Pro Leu Ser Ala Thr Val Thr Leu Thr Val Ala Val Ala Asp Ser Ile

Pro Glu Val Leu Thr Glu Leu Gly Ser Leu Lys Pro Ser Val Asp Pro
480

Asn Asp Ser Ser Leu Thr Leu Tyr Leu Val Val Ala Val Ala Ala Ile

Ser Cys Val Phe Leu Ala Phe Val Ala Val Leu Leu Gly Leu Arg Leu
505 510

Arg Arg Trp His Lys Ser Arg Leu Leu Gln Asp Ser Gly Gly Arg Leu

Val Gly Val Pro Ala Ser His Phe Val Gly Val Glu Glu Val Gln Ala
540

Phe Leu Gln Thr Tyr Ser Gln Glu Val Ser Leu Thr Ala Asp Ser Arg
555 560

Lys Ser His Leu Ile Phe Pro Gln Pro Asn Tyr Ala Asp Met Leu Ile
570 575

Ser Gln Glu Gly Cys Glu Lys Asn Asp Ser Leu Leu Thr Ser Val Asp
585 590

Phe His Glu Tyr Lys Asn Glu Ala Asp His Gly Gln Val Ser Leu Val
600 605

Leu Cys Leu Leu Leu Ile Ser Arg
610 615



